Entity resolution based EM for integrating heterogeneous distributed probabilistic data

Journal of Systems and Software - Tập 107 - Trang 93-109 - 2015
Ramesh Dharavath1, Chiranjeev Kumar1
1Department of Computer Science and Engineering, Indian School of Mines, Dhanbad 826004, Jharkhand, India

Tài liệu tham khảo

Abell´o, 2002, On relationships offering new drill-across possibilities, 7 Agrawal, 2006, ‘Trio: a system for data, uncertainty, and lineage Akbarinia, 2012, Entity resolution for uncertain data Antova, 2009, 10(106 worlds and beyond: efficient representation and processing incomplete information, VLDB J, 18, 10.1007/s00778-009-0149-y Ayat, 2013, Entity resolution for uncertain data, 31, 5010 Benjelloun, 2006, Uldbs: databases with uncertainty and lineage BRITE. http://www.cs.bu.edu/brite/. Cheng, 2004, Querying imprecise data in moving object environments, IEEE Trans. Knowl. Data Eng., 16 Dalvi, 2004, Efficient query evaluation on probabilistic databases, 10.1016/B978-012088469-8.50076-0 David, 2006 Elmagarmid, 2007, Duplicate record detection: a survey’, IEEE Trans. Knowl. Data Eng., 110, 10.1109/TKDE.2007.250581 Elissa, 2002, Special Issue on ‘Integration management’, IEEE Bull. Tech. Committee Data Eng., 25 Giannella, 2006, Efficient kernel density estimation over distributed data Hua, 2008, Ranking queries on uncertain data: a probabilistic threshold approach Jayram, 2006, Avatar information extraction system, IEEE Data Eng. Bull., 210 Kriegel, 2007, ‘Probabilistic nearest-neighbor query on uncertain objects Li, 2009, ‘Ranking distributed probabilistic data’ Lenzerini, 2002, Data integration: a theoretical perspective, 233 Magnani, 2007, ‘Uncertainty in data integration: current approaches and open problems Menestrina, 2006, Generic entity resolution with data confidences Miller., 2001, The Clio project: managing heterogeneity, SIGMOD Record, 30, 78, 10.1145/373626.373713 McClean, 2003, A scalable approach to integrating heterogeneous aggregate views of distributed databases, IEEE Trans. Knowledge Data Eng., 15, 232, 10.1109/TKDE.2003.1161592 McClean, 2000, Incorporating domain knowledge into attribute-oriented data mining, Int. J. Intell. Syst., 6, 535, 10.1002/(SICI)1098-111X(200006)15:6<535::AID-INT4>3.0.CO;2-9 McClean, 2001, Aggregation of imprecise and uncertain information in databases, IEEE Trans. Knowledge Data Eng.,, vol. 13, pp. 1002 Nan-Chen, 2005, Hybrid mining approach in the design of credit scoring models', Expert Syst. Appl., 28, 655, 10.1016/j.eswa.2004.12.022 Panse, 2010, Duplicate detection in probabilistic data Rahm, 2001, A survey of approaches to automatic schema matching, VLDB J., 10, 334, 10.1007/s007780100057 Song, 2007, 45 Sarawagi, 2002, Interactive deduplication using active learning Sarma, 2006, ‘Working models for uncertain data Singh, 2008, ‘Orion 2.0: native support for uncertain data Soliman, 2007, Proc. of ICDE Trajcevski, 2009, Continuous probabilistic nearest neighbor queries for uncertain trajectories Ye, 2010, Probabilistic top-k query processing in distributed sensor networks Yiu, 2009, Efficient evaluation of probabilistic advanced spatial queries on existentially uncertain data, IEEE Trans. Knowl. Data Eng., 21 Yuen, 2010, Superseding nearest neighbor search on uncertain spatial databases, IEEE Trans. Knowl. Data Eng., 22