Krimp: mining itemsets that compress
Tóm tắt
Từ khóa
Tài liệu tham khảo
Agrawal R, Mannila H, Srikant R, Toivonen H, Verkamo AI (1996) Fast discovery of association rules. In: Advances in knowledge discovery and data mining, AAAI, pp 307–328
Bathoorn R, Koopman A, Siebes A (2006) Reducing the frequent pattern set. In: Proceedings of the ICDM-workshops’06, pp 55–59
Bayardo R (1998) Efficiently mining long patterns from databases. In: Proceedings of SIGMOD’98, pp 85–93
Bringmann B, Zimmermann A (2007) The chosen few: on identifying valuable patterns. In: Proceedings of the ICDM’07, pp 63–72
Calders T, Goethals B (2002) Mining all non-derivable frequent itemsets. In: Proceedings of the ECML PKDD’02, pp 74–85
Chakrabarti D, Papadimitriou S, Modha DS, Faloutsos C (2004) Fully automatic cross-associations. In: Proceedings of KDD’04, pp 79–88
Chakrabarti S, Sarawagi S, Dom B (1998) Mining surprising patterns using temporal description length. In: Proceedings of VLDB’98, Morgan Kaufmann, San Francisco, pp 606–617
Chandola V, Kumar V (2007) Summarization—compressing data into an informative representation. Knowl Inf Syst 12(3): 355–378
Coenen F (2003) The LUCS–KDD discretised/normalised ARM and CARM data library. http://www.csc.liv.ac.uk/~frans/KDD/Software/LUCS-KDD-DN/DataSets/dataSets.html
Coenen F (2004) The LUCS–KDD software library. http://www.csc.liv.ac.uk/~frans/KDD/Software
Cover T, Thomas J (2006) Elements of information theory, 2nd edn. John Wiley and Sons, New York
Crémilleux B, Boulicaut JF (2002) Simplest rules characterizing classes generated by δ-free sets. In: Proceedings of KBSAAI’02, pp 33–46
Duda R, Hart P (1973) Pattern classification and scene analysis. John Wiley and Sons, New York
Faloutsos C, Megalooikonomou V (2007) On data mining, compression and Kolmogorov complexity. Data Min Knowl Discov 15(1): 3–20
Gionis A, Mannila H, Mielikäinen T, Tsaparas P (2007) Assessing data mining results via swap randomization. ACM Trans Knowl Discov Data 1(3): 14
Goethals B, Zaki MJ (2003) Frequent itemset mining implementations repository (FIMI). http://fimi.cs.helsinki.fi
Grünwald PD (2005) Minimum description length tutorial. In: Grünwald P, Myung I (eds) Advances in minimum description length. MIT Press, Cambridge
Hand, D, Adams, N, Bolton, R (eds) (2002) Pattern detection and discovery. Springer, New York
Heikinheimo H, Hinkkanen E, Mannila H, Mielikäinen T, Seppänen JK (2007) Finding low-entropy sets and trees from binary data. In: Proceedings of KDD’07, pp 350–359
Heikinheimo H, Vreeken J, Siebes A, Mannila H (2009) Low-entropy set selection. In: Proceedings of SDM’09, pp 569–579
Karp RM (1972) Reducibility among combinatorial problems. In: Miller R, Thatcher J (eds) Proceedings of a symposium on the complexity of computer computations. Plenum Press, New York, USA, pp 85–103
Keogh E, Lonardi S, Ratanamahatana CA (2004) Towards parameter-free data mining. In: Proceedings of KDD’04, pp 206–215
Keogh E, Lonardi S, Ratanamahatana CA, Wei L, Lee SH, Handley J (2007) Compression-based data mining of sequential data. Data Min Knowl Discov 14(1): 99–129
Knobbe AJ, Ho EKY (2006a) Maximally informative k-itemsets and their efficient discovery. In: Proceedings of KDD’06, pp 237–244
Kohavi R, Brodley C, Frasca B, Mason L, Zheng Z (2000) KDD-Cup 2000 organizers’ report: peeling the onion. SIGKDD Explor 2(2):86–98. http://www.ecn.purdue.edu/KDDCUP
Koopman A, Siebes A (2008) Discovering relational items sets efficiently. In: Zaki M, Wang K (eds) Proceedings of SDM’08, SIAM, pp 108–119
Koopman A, Siebes A (2009) Characteristic relational patterns. In: Proceedings of KDD’09, pp 437–446
Li M, Vitányi P (1993) An introduction to Kolmogorov complexity and its applications. Springer, New York
Liu B, Hsu W, Ma Y (1998) Integrating classification and association rule mining. In: Proceedings of KDD’98, pp 80–86
Liu G, Lu H, Yu JX, Wei W, Xiao X (2004) AFOPT: an efficient implementation of pattern growth approach. In: Proceedings of the 2nd workshop on frequent itemset mining implementations
Mannila H, Toivonen H (1996) Multiple uses of frequent sets and condensed representations. In: Proceedings of KDD’96, pp 189–194
Mannila H, Toivonen H (1997) Levelwise search and borders of theories in knowledge discovery. Data mining and knowledge discovery, pp 241–258
Mehta M, Agrawal R, Rissanen J (1996) Sliq: a fast scalable classifier for data mining. In: Advances in database technology. Springer, NY, pp 18–32
Meretakis D, Lu H, Wüthrich B (2000) A study on the performance of large bayes classifier. In: Proceedings of the ECML’00, pp 271–279
Mielikäinen T, Mannila H (2003) The pattern ordering problem. In: Proceedings of the ECML PKDD’03, pp 327–338
Mitchell-Jones AJ, Amori G, Bogdanowicz W, Krystufek B, Reijnders PJH, Spitzenberger F, Stubbe M, Thissen JBM, Vohralik V, Zima J (1999) The atlas of European mammals. Academic Press, London
Morik, K, Boulicaut, JF, Siebes, A (eds) (2005) Local pattern detection. Springer, New York
Myllykangas S, Himberg J, Böhling T, Nagy B, Hollmén J, Knuutila S (2006) Dna copy number amplification profiling of human neoplasms. Oncogene 25(55)
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceedings of the ICDT’99, pp 398–416
Pfahringer B (1995) Compression-based feature subset selection. In: Proceedings of the IJCAI’95 workshop on data engineering for inductive learning, pp 109–119
Quinlan J (1993b) C4.5: programs for machine learning. Morgan-Kaufmann, Los Altos
Siebes A, Vreeken J, van Leeuwen M (2006) Item sets that compress. In: Proceedings of SDM’06, pp 393–404
Sun J, Faloutsos C, Papadimitriou S, Yu PS (2007) Graphscope: parameter-free mining of large time-evolving graphs. In: Proceedings of KDD’07, pp 687–696
Tatti N, Vreeken J (2008) Finding good itemsets by packing data. In: Proceedings of the ICDM’08, pp 588–597
van Leeuwen M, Siebes A (2008) Streamkrimp: detecting change in data streams. In: Proceedings of ECMLPKDD’08, Springer, Heidelberg, pp 672–687
van Leeuwen M, Vreeken J, Siebes A (2006) Compression picks the item sets that matter. In: Proceedings of the ECML PKDD’06, pp 585–592
van Leeuwen M, Vreeken J, Siebes A (2009) Identifying the components. Data Min Knowl Discov 19(2): 173–292
Vreeken J, Siebes A (2008) Filling in the blanks—Krimp minimisation for missing data. In: Proceedings of the ICDM’08, pp 1067–1072
Vreeken J, van Leeuwen M, Siebes A (2007a) Characterising the difference. In: Proceedings of KDD’07, pp 765–774
Vreeken J, van Leeuwen M, Siebes A (2007b) Preserving privacy through data generation. In: Proceedings of the ICDM’07, pp 685–690
Wallace C (2005) Statistical and inductive inference by minimum message length. Springer, New York
Wang J, Karypis G (2005) HARMONY: efficiently mining the best rules for classification. In: Proceedings of SDM’05, pp 205–216
Wang J, Karypis G (2006) On efficiently summarizing categorical databases. Knowl Inf Syst 9(1): 19–37
Wang C, Parthasarathy S (2006) Summarizing itemset patterns using probabilistic models. In: Proceedings of KDD’06, pp 730–735
Warner H, Toronto A, Veasey L, Stephenson R (1961) A mathematical model for medical diagnosis, application to congenital heart disease. J Am Med Assoc 177: 177–184
Witten I, Frank E (2005) Data mining: practical machine learning tools and techniques. 2nd edn. Morgan Kaufmann, San Francisco
Xiang Y, Jin R, Fuhry D, Dragan FF (2008) Succinct summarization of transactional databases: an overlapped hyperrectangle scheme. In: Proceedings of KDD’08, pp 758–766
Xin D, Han J, Yan X, Cheng H (2005) Mining compressed frequent-pattern sets. In: Proceedings of VLDB’05, pp 709–720
Yan X, Cheng H, Han J, Xin D (2005) Summarizing itemset patterns: a profile-based approach. In: Proceedings of KDD’05, pp 314–323
Yin X, Han J (2003) CPAR: Classification based on predictive association rules. In: Proceedings of SDM’03, pp 331–335