An extensive comparative study of cluster validity indices
Abstract
Keywords
References
Halkidi, 2001, On clustering validation techniques, Journal of Intelligent Information Systems, 17, 107, 10.1023/A:1012801612483
Jain, 1988
Mirkin, 2005
Sneath, 1973
Holzinger, 1941
Chou, 2004, A new cluster validity measure and its application to image compression, Pattern Analysis and Applications, 7, 205, 10.1007/s10044-004-0218-1
2002
Pal, 1997, Cluster validation using graph theoretic concepts, Pattern Recognition, 30, 847, 10.1016/S0031-3203(96)00127-6
I. Guyon, U. von Luxburg, R.C. Williamson, Clustering: science or art?, in: NIPS 2009 Workshop on Clustering Theory, Vancouver, Canada, 2009.
Brun, 2007, Model-based evaluation of clustering validation measures, Pattern Recognition, 40, 807, 10.1016/j.patcog.2006.06.026
Pfitzner, 2009, Characterization and evaluation of similarity measures for pairs of clusterings, Knowledge and Information Systems, 19, 361, 10.1007/s10115-008-0150-6
Batagelj, 1995, Comparing resemblance measures, Journal of Classification, 12, 73, 10.1007/BF01202268
Dunn, 1973, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics, 3, 32, 10.1080/01969727308546046
Davies, 1979, A clustering separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 224, 10.1109/TPAMI.1979.4766909
Caliński, 1974, A dendrite method for cluster analysis, Communications in Statistics, 3, 1, 10.1080/03610927408827101
A. Ben-Hur, A. Elisseeff, I. Guyon, A stability based method for discovering structure in clustered data, in: Biocomputing 2002 Proceedings of the Pacific Symposium, vol. 7, 2002, pp. 6–17.
Jain, 1987, Bootstrap technique in cluster analysis, Pattern Recognition, 20, 547, 10.1016/0031-3203(87)90081-1
Dimitriadou, 2002, An examination of indexes for determining the number of clusters in binary data sets, Psychometrika, 67, 137, 10.1007/BF02294713
Maulik, 2002, Performance evaluation of some clustering algorithms and validity indices, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 1650, 10.1109/TPAMI.2002.1114856
Milligan, 1985, An examination of procedures for determining the number of clusters in a data set, Psychometrika, 50, 159, 10.1007/BF02294245
Halkidi, 2008, A density-based cluster validity approach using multi-representatives, Pattern Recognition Letters, 29, 773, 10.1016/j.patrec.2007.12.011
Hardy, 1996, On the number of clusters, Computational Statistics & Data Analysis, 23, 83, 10.1016/S0167-9473(96)00022-9
Lago-Fernández, 2010, Normality-based validation for crisp clustering, Pattern Recognition, 43, 782, 10.1016/j.patcog.2009.09.018
Žalik, 2011, Validity index for clusters of different sizes and densities, Pattern Recognition Letters, 32, 221, 10.1016/j.patrec.2010.08.007
Kim, 2005, New indices for cluster validity assessment, Pattern Recognition Letters, 26, 2353, 10.1016/j.patrec.2005.04.007
Saha, 2009, Performance evaluation of some symmetry-based cluster validity indexes, IEEE Transactions on Systems, Man, and Cybernetics, Part C, 39, 420, 10.1109/TSMCC.2009.2013335
Dubes, 1987, How many clusters are best? – an experiment, Pattern Recognition, 20, 645, 10.1016/0031-3203(87)90034-3
Gurrutxaga, 2011, Towards a standard methodology to evaluate internal cluster validity indices, Pattern Recognition Letters, 32, 505, 10.1016/j.patrec.2010.11.006
Bezdek, 1997, A geometric approach to cluster validity for normal mixtures, Soft Computing—A Fusion of Foundations, Methodologies and Applications, 1, 166
Bandyopadhyay, 2008, A point symmetry-based clustering technique for automatic evolution of clusters, IEEE Transactions on Knowledge and Data Engineering, 20, 1441, 10.1109/TKDE.2008.79
Kim, 2001, A novel validity index for determination of the optimal number of clusters, IEICE Transactions on Information and Systems, E84-D, 281
Sugar, 2003, Finding the number of clusters in a dataset, Journal of the American Statistical Association, 98, 750, 10.1198/016214503000000666
Baker, 1975, Measuring the power of hierarchical cluster analysis, Journal of the American Statistical Association, 70, 31, 10.1080/01621459.1975.10480256
Hubert, 1976, A general statistical framework for assessing categorical clustering in free recall, Psychological Bulletin, 83, 1072, 10.1037/0033-2909.83.6.1072
Rousseeuw, 1987, Silhouettes, Journal of Computational and Applied Mathematics, 20, 53, 10.1016/0377-0427(87)90125-7
Bezdek, 1998, Some new indexes of cluster validity, IEEE Transactions on Systems, Man, and Cybernetics, Part B, 28, 301, 10.1109/3477.678624
M. Halkidi, M. Vazirgiannis, Clustering validity assessment: finding the optimal partitioning of a data set, in: Proceedings of the First IEEE International Conference on Data Mining (ICDM'01), California, USA, 2001, pp. 187–194.
Saitta, 2007, A bounded index for cluster validity, vol. 4571, 174
Jaccard, 1908, Nouvelles recherches sur la distribution florale, Bulletin de la Société Vaudoise des Sciences Naturelles, 44, 223
M. Meilă, Comparing clusterings by the variation of information, in: Proceedings of the Sixteenth Annual Conference on Computational Learning Theory (COLT), 2003, pp. 173–187.
A. Frank, A. Asuncion, UCI machine learning repository, 2010.
Demšar, 2006, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1
Dietterich, 1998, Approximate statistical tests for comparing supervised classification learning algorithms, Neural Computation, 10, 1895, 10.1162/089976698300017197
García, 2008, An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons, Journal of Machine Learning Research, 9, 2677