Selecting the right objective measure for association analysis

Information Systems - Volume 29 - Pages 293-313 - 2004
Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava
Department of Computer Science, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA

References

R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases, in: Proceedings of 1993 ACM-SIGMOD International Conference on Management of Data, Washington, DC, May 1993, pp. 207–216.
Agrawal, 1993, Database mining, IEEE Trans. Knowledge Data Eng., 5, 914, 10.1109/69.250074
Mosteller, 1968, Association and estimation in contingency tables, J. Am. Stat. Assoc., 63, 1, 10.2307/2283825
Agresti, 1990
G. Piatetsky-Shapiro, Discovery, analysis and presentation of strong rules, in: G. Piatetsky-Shapiro, W. Frawley (Eds.), Knowledge Discovery in Databases, MIT Press, Cambridge, MA, 1991, pp. 229–248.
R.J. Hilderman, H.J. Hamilton, B. Barber, Ranking the interestingness of summaries from data mining systems, in: Proceedings of the 12th International Florida Artificial Intelligence Research Symposium (FLAIRS’99), Orlando, FL, May 1999, pp. 100–106.
Hilderman, 2001
I. Kononenko, On biases in estimating multi-valued attributes, in: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI’95), Montreal, Canada, 1995, pp. 1034–1040.
R. Bayardo, R. Agrawal, Mining the most interesting rules, in: Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, San Diego, CA, August 1999, pp. 145–154.
M. Gavrilov, D. Anguelov, P. Indyk, R. Motwani, Mining the stock market: which measure is best? in: Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining, Boston, MA, 2000.
Y. Zhao, G. Karypis, Criterion functions for document clustering: experiments and analysis. Technical Report TR01-40, Department of Computer Science, University of Minnesota, 2001.
Goodman, 1968, Measures of association for cross-classifications, J. Am. Stat. Assoc., 49, 732, 10.2307/2281536
Yule, 1900, On the association of attributes in statistics, Philos. Trans. R. Soc. A, 194, 257, 10.1098/rsta.1900.0019
Yule, 1912, On the methods of measuring association between two attributes, J. R. Stat. Soc., 75, 579, 10.2307/2340126
Cohen, 1960, A coefficient of agreement for nominal scales, Educ. Psychol. Meas., 20, 37, 10.1177/001316446002000104
Cover, 1991
Smyth, 1991, Rule induction using information theory, 159
Breiman, 1984
R. Agrawal, R. Srikant, Fast algorithms for mining association rules in large databases, in: Proceedings of the 20th VLDB Conference, Santiago, Chile, September 1994, pp. 487–499.
P. Clark, R. Boswell, Rule induction with cn2: some recent improvements, in: Proceedings of the European Working Session on Learning EWSL-91, Porto, Portugal, 1991, pp. 151–163.
S. Brin, R. Motwani, J. Ullman, S. Tsur, Dynamic itemset counting and implication rules for market basket data, in: Proceedings of 1997 ACM-SIGMOD International Conference on Management of Data, Montreal, Canada, June 1997, pp. 255–264.
S. Brin, R. Motwani, C. Silverstein, Beyond market baskets: generalizing association rules to correlations, in: Proceedings of 1997 ACM-SIGMOD International Conference on Management of Data, Tucson, Arizona, June 1997, pp. 255–264.
Silverstein, 1998, Beyond market baskets, Data Mining Knowledge Discovery, 2, 39, 10.1023/A:1009713703947
T. Brijs, G. Swinnen, K. Vanhoof, G. Wets, Using association rules for product assortment decisions: a case study, in: Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, San Diego, CA, August 1999, pp. 254–260.
C. Clifton, R. Cooley, Topcat: data mining for topic identification in a text corpus, in: Proceedings of the 3rd European Conference of Principles and Practice of Knowledge Discovery in Databases, Prague, Czech Republic, September 1999, pp. 174–183.
W. DuMouchel, D. Pregibon, Empirical bayes screening for multi-item associations, in: Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, 2001, pp. 67–76.
Shortliffe, 1975, A model of inexact reasoning in medicine, Math. Biosci., 23, 351, 10.1016/0025-5564(75)90047-4
C.C. Aggarwal, P.S. Yu, A new framework for itemset generation, in: Proceedings of the 17th Symposium on Principles of Database Systems, Seattle, WA, June 1998, pp. 18–24.
S. Sahar, Y. Mansour, An empirical evaluation of objective interestingness criteria, in: SPIE Conference on Data Mining and Knowledge Discovery, Orlando, FL, April 1999, pp. 63–74.
P.N. Tan, V. Kumar, Interestingness measures for association patterns: a perspective, in: KDD 2000 Workshop on Postprocessing in Machine Learning and Data Mining, Boston, MA, August 2000.
van Rijsbergen, 1979
Klosgen, 1992, Problems for knowledge discovery in databases and their treatment in the statistics interpreter explora, Int. J. Intell. Systems, 7, 649, 10.1002/int.4550070707
M. Kamber, R. Shinghal, Evaluating the interestingness of characteristic rules, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, 1996, pp. 263–266.
Hand, 2001
A. George, W.H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Series in Computational Mathematics, Prentice-Hall, Englewood Cliffs, NJ, 1981.