A k-mean clustering algorithm for mixed numeric and categorical data
Abstract
Keywords
References
Frawley, 1992, Knowledge discovery in databases: an overview, AI Magazine, 213
Fayyad, 1996
F. Can, E. Ozkarahan, A dynamic cluster maintenance system for information retrieval, in: Proceedings of the Tenth Annual International ACM SIGIR Conference, 1987, pp. 123–131.
M. Eisen, P. Spellman, P. Brown, D. Botstein, Cluster analysis and display of genome-wide expression patterns, in: Proceedings of the National Academy of Sciences of the USA, vol. 95, 1998, pp. 14863–14868.
Duda, 1973
Jain, 1988
J.B. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281–297.
Huang, 1997, Clustering large data sets with mixed numeric and categorical values
Kaufman, 1990
R. Ng, J. Han, Efficient and effective clustering method for spatial data mining, in: Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, 1994, pp. 144–155.
Huang, 1998, Extensions to the K-modes algorithm for clustering large data sets with categorical values, Data Mining and Knowledge Discovery, 2, 10.1023/A:1009769707641
M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of KDD’96, 1996.
Sander, 1998, Density-based clustering in spatial databases: The algorithm GDBSCAN and its applications, Data Mining and Knowledge Discovery, 2, 169, 10.1023/A:1009745219419
Dunn, 1974, Some recent investigations of a new fuzzy partitional algorithm and its application to pattern classification problems, Journal of Cybernetics, 4, 1, 10.1080/01969727408546062
Bezdek, 1981
Huang, 1999, A fuzzy k-modes algorithm for clustering categorical data, IEEE Transactions on Fuzzy Systems, 7, 446, 10.1109/91.784206
C. Döring, C. Borgelt, R. Kruse, Fuzzy clustering of quantitative and qualitative data, in: Proceedings of NAFIPS, Banff, Alberta, 2004.
Fisher, 1987, Knowledge acquisition via incremental conceptual clustering, Machine Learning, 2, 139, 10.1007/BF00114265
Lebowitz, 1987, Experiments with incremental concept formation, Machine Learning, 2, 103, 10.1007/BF00114264
M. Gluck, J. Corter, Information, uncertainty, and the utility of categories, in: Proceedings of the Seventh Annual Conference of the Cognitive Science Society, 1985, pp. 283–287.
K. McKusick, K. Thompson, COBWEB/3: A portable implementation, Technical Report FIA-90-6-18-2, NASA Ames Research Center, 1990.
Reich, 1991, The formation and use of abstract concepts in design, 323
Biswas, 1998, ITERATE: A conceptual clustering algorithm for data mining, IEEE Transactions on Systems, Man, and Cybernetics, 28C, 219, 10.1109/5326.669556
Cheeseman, 1995, Bayesian classification (AutoClass): Theory and results, Advances in Knowledge Discovery and Data Mining
S. Guha, R. Rastogi, K. Shim, ROCK: A robust clustering algorithm for categorical attributes, in: Proceedings of 15th International Conference on Data Engineering, Sydney, Australia, 23–26 March 1999, pp. 512–521.
V. Ganti, J. Gehrke, R. Ramakrishnan, CACTUS-clustering categorical data using summaries, in: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1999, pp. 73–83.
Modha, 2003, Feature weighting in k-mean clustering, Machine Learning, 52, 217, 10.1023/A:1024016609528
T. Zhang, R. Ramakrishnan, M. Livny, BIRCH: An efficient data clustering method for very large databases, in: SIGMOD Conference, 1996, pp. 103–114.
Ankerst, 1999, OPTICS: Ordering points to identify the clustering structure, 49
S. Guha, R. Rastogi, K. Shim, CURE: An efficient clustering algorithm for clustering large databases, in: Proceedings of the Symposium on Management of Data (SIGMOD), 1998.
Karypis, 1999, CHAMELEON: A hierarchical clustering algorithm using dynamic modeling, IEEE Computer, 32, 68, 10.1109/2.781637
Li, 2002, Unsupervised learning with mixed numeric and nominal data, IEEE Transactions on Knowledge and Data Engineering, 14, 673, 10.1109/TKDE.2002.1019208
Huang, 2005, Automated variable weighting in k-mean type clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 10.1109/TPAMI.2005.95
H. Luo, F. Kong, Y. Li, Clustering mixed data based on evidence accumulation, in: X. Li, O.R. Zaiane, Z. Li (Eds.), ADMA 2006, Lecture Notes in Artificial Intelligence 4093.
He, 2005, Scalable algorithms for clustering large datasets with mixed type attributes, International Journal of Intelligent Systems, 20, 1077, 10.1002/int.20108
He, 2002, Squeezer: An efficient algorithm for clustering categorical data, Journal of Computer Science and Technology, 17, 611, 10.1007/BF02948829
Stanfill, 1986, Toward memory-based reasoning, Communications of the ACM, 29, 1213, 10.1145/7902.7906
Witten, 2000
P. Andritsos, P. Tsaparas, R.J. Miller, K.C. Sevcik, LIMBO: Scalable clustering of categorical data, in: 9th International Conference on Extending DataBase Technology (EDBT), March 2004.
Ahmad, 2007, A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set, Pattern Recognition Letters, 28, 110, 10.1016/j.patrec.2006.06.006
Ahmad, 2005, A feature selection technique for classificatory analysis, Pattern Recognition Letters, 26, 43, 10.1016/j.patrec.2004.08.015
Basak, 1998, Unsupervised feature selection using a neuro-fuzzy approach, Pattern Recognition Letters, 19, 997, 10.1016/S0167-8655(98)00083-X
Yeung, 2002, Improving performance of similarity-based clustering by feature weight learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 556, 10.1109/34.993562
Sonbaty, 1998, Fuzzy clustering for symbolic data, IEEE Transactions on Fuzzy Systems, 6, 195, 10.1109/91.669013
A. Ahmad, L. Dey, A K-mean clustering algorithm for mixed numeric and categorical data set using dynamic distance measure, in: Proceedings of Fifth International Conference on Advances in Pattern Recognition, ICAPR2003, 2003.
Won, 2005, A k-populations algorithm for clustering categorical data, Pattern Recognition, 38, 1131, 10.1016/j.patcog.2004.11.017
Peña, 1999, An empirical comparison of four initialization methods for the K-mean algorithm, Pattern Recognition Letters, 20, 1027, 10.1016/S0167-8655(99)00069-0
Bradley, 1998, Refining initial points for K-mean clustering, 91
Khan, 2004, Cluster center initialization algorithm for K-mean clustering, Pattern Recognition Letters, 25, 1293, 10.1016/j.patrec.2004.04.007
Yang, 1999, An evaluation of statistical approaches to text categorization, Journal of Information Retrieval, 1, 67