On the Meaningfulness of “Big Data Quality” (Invited Paper)