Cluster validity methods

SIGMOD Record - Volume 31, Issue 2 - Pages 40-45 - 2002
Maria Halkidi, Yannis Batistakis, Michalis Vazirgiannis
Athens University of Economics & Business

Abstract

Clustering is an unsupervised process, since there are no predefined classes and no examples that would indicate grouping properties in the data set. Most clustering algorithms behave differently depending on the features of the data set and the initial assumptions for defining groups. Therefore, in most applications the resulting clustering scheme requires some evaluation of its validity. Evaluating and assessing the results of a clustering algorithm is the main subject of cluster validity. In this paper we present a review of cluster validity and its methods. More specifically, Part I of the paper discusses the cluster validity approaches based on external and internal criteria.
