An examination of procedures for determining the number of clusters in a data set

Psychometrika - 1985

Glenn W. Milligan¹, Martha Cooper¹

¹Faculty of Management Sciences, The Ohio State University, Columbus

References

Andrews, D. F. (1972). Plots of high-dimensional data. Biometrics, 28, 125–136.

Arnold, S. J. (1979). A test for clusters. Journal of Marketing Research, 19, 545–551.

Baker, F. B., & Hubert, L. J. (1975). Measuring the power of hierarchical cluster analysis. Journal of the American Statistical Association, 70, 31–38.

Ball, G. H., & Hall, D. J. (1965). ISODATA, A novel method of data analysis and pattern classification. Menlo Park: Stanford Research Institute. (NTIS No. AD 699616).

Beale, E. M. L. (1969). Cluster analysis. London: Scientific Control Systems.

Binder, D. A. (1978). Bayesian cluster analysis. Biometrika, 65, 31–38.

Blashfield, R. K., & Morey, L. C. (1980). A comparison of four clustering methods using MMPI Monte Carlo data. Applied Psychological Measurement, 4, 57–64.

Bock, H. H. (1977). On tests concerning the existence of a classification. In First international symposium on data analysis and informatics (Vol. 2, pp. 449–464). Rocquencourt, France: IRIA.

Calinski, R. B., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics, 3, 1–27.

Cohen, A. C. (1967). Estimation in mixtures of two normal distributions. Technometrics, 9, 15–28.

Davies, D. L., & Bouldin, D. W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 224–227.

Day, N. E. (1969). Estimating the components of a mixture of normal distributions. Biometrika, 56, 463–474.

Dubes, R., & Jain, A. K. (1979). Validity studies in clustering methodologies. Pattern Recognition, 11, 235–254.

Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis. New York: Wiley.

Edwards, A. W. F., & Cavalli-Sforza, L. (1965). A method for cluster analysis. Biometrics, 21, 362–375.

Englemann, L., & Hartigan, J. A. (1969). Percentage points of a test for clusters. Journal of the American Statistical Association, 64, 1647–1648.

Everitt, B. S. (1979). Unresolved problems in cluster analysis. Biometrics, 35, 169–181.

Everitt, B. S. (1981). A Monte Carlo investigation in the likelihood ratio test for the number of components in a mixture of normal distributions. Multivariate Behavioral Research, 16, 171–180.

Fleiss, J. L., Lawlor, W., Platman, S. R., & Fieve, R. R. (1971). On the use of inverted factor analysis for generating typologies. Journal of Abnormal Psychology, 77, 127–132.

Fleiss, J. L., & Zubin, J. (1969). On the methods and theory of clustering. Multivariate Behavioral Research, 4, 235–250.

Friedman, H. P., & Rubin, J. (1967). On some invariant criteria for grouping data. Journal of the American Statistical Association, 62, 1159–1178.

Frey, T., & Van Groenewoud, H. (1972). A cluster analysis of the D-squared matrix of white spruce stands in Saskatchewan based on the maximum-minimum principle. Journal of Ecology, 60, 873–886.

Fukunaga, K., & Koontz, W. L. G. (1970). A criterion and an algorithm for grouping data. IEEE Transactions on Computers, C-19, 917–923.

Gengerelli, J. A. (1963). A method for detecting subgroups in a population and specifying their membership list. Journal of Psychology, 5, 457–468.

Gnanadesikan, R., Kettenring, J. R., & Landwehr, J. M. (1977). Interpreting and assessing the results of cluster analyses. Bulletin of the International Statistical Institute, 47, 451–463.

Good, I. J. (1982). An index of separateness of clusters and a permutation test for its statistical significance. Journal of Statistical Computing and Simulation, 15, 81–84.

Goodall, D. W. (1966). Hypothesis testing in classification. Nature, 221, 329–330.

Gower, J. C. (1975). Goodness-of-fit criteria for classification and other patterned structures. In G. Estabrook (Ed.), Proceedings of the 8th international conference on numerical taxonomy. San Francisco: Freeman.

Gower, J. C. (1981, June). Is classification statistical? Paper presented at the meeting of the Classification Society, Toronto.

Hall, D. J., Duda, R. O., Huffman, D. A., & Wolf, E. E. (1973). Development of new pattern recognition methods. Los Angeles: Aerospace Research Laboratories. (NTIS No. AD 7726141).

Hansen, R. A., & Milligan, G. W. (1981). Objective assessment of cluster analysis output: Theoretical considerations and empirical findings. Proceedings of the American Institute for Decision Sciences, 314–316.

Hartigan, J. A. (1975). Clustering algorithms. New York: Wiley.

Hartigan, J. A. (1977). Distribution problems in clustering. In J. Van Ryzin (Ed.), Classification and clustering. New York: Academic Press.

Hartigan, J. A. (1978). Asymptotic distributions for clustering criteria. Annals of Statistics, 6, 117–131.

Hill, R. S. (1980). A stopping rule for partitioning dendrograms. Botanical Gazette, 141, 321–324.

Hubert, L. J., & Baker, F. B. (1977). The comparison and fitting of given classification schemes. Journal of Mathematical Psychology, 16, 233–253.

Hubert, L. J., & Levin, J. R. (1976). A general statistical framework for assessing categorical clustering in free recall. Psychological Bulletin, 83, 1072–1080.

Jain, A. K., & Waller, W. G. (1978). On the number of features in the classification of multivariate gaussian data. Pattern Recognition, 10, 365–374.

Jancey, R. C. (1966). Multidimensional group analysis. Australian Journal of Botany, 14, 127–130.

Johnson, S. C. (1967). Hierarchical clustering schemes. Psychometrika, 32, 241–254.

Lee, K. L. (1979). Multivariate tests for clusters. Journal of the American Statistical Association, 74, 708–714.

Lingoes, J. C., & Cooper, T. (1971). PEP-I: A FORTRAN IV (G) program for Guttman-Lingoes nonmetric probability clustering. Behavioral Science, 16, 259–261.

Marriott, F. H. C. (1971). Practical problems in a method of cluster analysis. Biometrics, 27, 501–514.

McClain, J. O., & Rao, V. R. (1975). CLUSTISZ: A program to test for the quality of clustering of a set of objects. Journal of Marketing Research, 12, 456–460.

Milligan, G. W. (1980). An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 45, 325–342.

Milligan, G. W. (1981a). A Monte Carlo study of thirty internal criterion measures for cluster analysis. Psychometrika, 46, 187–199.

Milligan, G. W. (1981b). A review of Monte Carlo tests of cluster analysis. Multivariate Behavioral Research, 16, 379–407.

Milligan, G. W. (1981c, June). A discussion of procedures for determining the number of clusters in a data set. Paper presented at the meeting of the Classification Society, Toronto.

Milligan, G. W. (1983). Characteristics of four external criterion measures. In J. Felsenstein (Ed.), Proceedings of the 1982 NATO Advanced Studies Institute on Numerical Taxonomy (pp. 167–173). New York: Springer-Verlag.

Milligan, G. W., & Sokol, L. M. (1980). A two-stage clustering algorithm with robust recovery characteristics. Educational and Psychological Measurement, 40, 755–759.

Milligan, G. W., Soon, S. C., & Sokol, L. M. (1983). The effect of cluster size, dimensionality, and the number of clusters on recovery of true cluster structure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5, 40–47.

Mojena, R. (1977). Hierarchical grouping methods and stopping rules: An evaluation. The Computer Journal, 20, 359–363.

Morey, L., & Agresti, A. (1984). The measurement of classification agreement: An adjustment to the Rand statistic for chance agreement. Educational and Psychological Measurement, 44, 33–37.

Mountford, M. D. (1970). A test for the difference between clusters. In G. P. Patil, E. C. Pielou, & W. E. Waters (Eds.), Statistical Ecology (Vol. 3, pp. 237–257). University Park, Pa.: Pennsylvania State University Press.

Naus, J. I. (1966). A power comparison of two tests of non-random clustering. Technometrics, 8, 493–517.

Orloci, L. (1967). An agglomerative method for classification of plant communities. Journal of Ecology, 55, 193–206.

Perruchet, C. (1983). Les épreuves de classifiabilité en analyses des données [Statistical tests of classificability] (Tech. Rep. NT/PAA/ATR/MTI/810). Issy-Les-Moulineaux, France: C.N.E.T.

Ray, A. A. (Ed.). (1982). SAS user's guide: Statistics. Cary, North Carolina: SAS Institute.

Ratkowsky, D. A., & Lance, G. N. (1978). A criterion for determining the number of groups in a classification. Australian Computer Journal, 10, 115–117.

Rohlf, F. J. (1974). Methods of comparing classifications. Annual Review of Ecology and Systematics, 5, 101–113.

Rubin, J. (1967). Optimal classification into groups: An approach for solving the taxonomy problem. Journal of Theoretical Biology, 15, 103–144.

Sarle, W. S. (1983). Cubic clustering criterion (Tech. Rep. A-108). Cary, N.C.: SAS Institute.

Scott, A. J., & Symons, M. J. (1971). Clustering methods based on likelihood ratio criteria. Biometrics, 27, 387–397.

Sneath, P. H. A. (1977). A method for testing the distinctness of clusters: A test of the disjunction of two clusters in Euclidean space as measured by their overlap. Mathematical Geology, 9, 123–143.

Sneath, P. H. A., & Sokal, R. R. (1973). Numerical taxonomy. San Francisco: Freeman.

Sokal, R. R., & Sneath, P. H. A. (1963). Principles of numerical taxonomy. San Francisco: Freeman.

Thorndike, R. L. (1953). Who belongs in a family? Psychometrika, 18, 267–276.

Wolfe, J. H. (1970). Pattern clustering by multivariate mixture analysis. Multivariate Behavioral Research, 5, 329–350.

Wong, M. A. (1982). A hybrid clustering method for identifying high-density clusters. Journal of the American Statistical Association, 77, 841–847.

Wong, M. A., & Schaak, C. (1982). Using the Kth nearest neighbor clustering procedure to determine the number of subpopulations. Proceedings of the Statistical Computing Section, American Statistical Association, 40–48.
