Large-sample results for optimization-based clustering methods
Abstract
Many common (nonhierarchical) clustering and classification methods are optimization-based methods, in the sense described by Windham (1987) in this Journal. This paper gives some large sample properties for estimates derived by such methods. Under appropriate conditions, such estimates converge with probability one to a limit, and are asymptotically normally distributed around that limiting value. The conditions are satisfied by most of the common examples of optimization-based methods.
References
BARDWELL, R. A. (1989), “Asymptotic Behavior of Certain Estimators Under Mild Regularity Conditions,” Ph.D. dissertation, Department of Mathematics, University of Colorado at Boulder.
BOCK, H.-H. (1985), “On Some Significance Tests in Cluster Analysis,” Journal of Classification, 2, 77–108.
BOENTE, G., and FRAIMAN, R. (1988), “On the Asymptotic Behavior of General Maximum Likelihood Estimates for the Nonregular Case Under Nonstandard Conditions,” Biometrika, 75, 45–56.
BRYANT, P. G. (1988), “On Characterizing Optimization-Based Clustering Criteria,” Journal of Classification, 5, 81–84.
BRYANT, P. G., and WILLIAMSON, J. A. (1978), “Asymptotic Behaviour of Classification Maximum Likelihood Estimates,” Biometrika, 65, 273–281.
BRYANT, P. G., and WILLIAMSON, J. A. (1984), “The Asymptotic Distribution of Statistics Derived by Maximizing Sums,” Faculty Working Paper Series number UCD-CBA 1984-3, College of Business and Administration, University of Colorado at Denver.
BRYANT, P. G., and WILLIAMSON, J. A. (1986), “Maximum Likelihood and Classification: A Comparison of Three Approaches,” in Classification as a Tool of Research, Eds. W. Gaul and M. Schader, Amsterdam: North-Holland, 35–45.
DANIELS, H. E. (1961), “The Asymptotic Efficiency of a Maximum Likelihood Estimator,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, Ed. J. Neyman, Berkeley and Los Angeles: University of California Press, 151–163.
DAY, W. H. E., and EDELSBRUNNER, H. (1985), “Investigation of Proportional Link Linkage Clustering Methods,” Journal of Classification, 2, 239–254.
DUPACOVA, J., and WETS, R. (1988), “Asymptotic Behavior of Statistical Estimators and of Optimal Solutions of Stochastic Optimization Problems,” Annals of Statistics, 16, 4, 1517–1549.
FOUTZ, R. V., and SRIVASTAVA, R. C. (1977), “The Performance of the Likelihood Ratio Test When the Model is Incorrect,” Annals of Statistics, 5, 1183–1194.
HARTIGAN, J. A. (1978), “Asymptotic Distributions for Clustering Criteria,” Annals of Statistics, 6, 117–131.
HUBER, P. J. (1967), “The Behavior of Maximum Likelihood Estimates under Non-standard Conditions,” in Proceedings, Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, Eds. L. M. Le Cam and J. Neyman, Berkeley and Los Angeles: University of California Press, 221–233.
MARRIOTT, F. H. C. (1975), “Separating Mixtures of Normal Distributions,” Biometrics, 31, 767–769.
MARRIOTT, F. H. C. (1982), “Optimization Methods of Cluster Analysis,” Biometrika, 69, 417–421.
POLLARD, D. (1981), “Strong Consistency of k-means Clustering,” Annals of Statistics, 9, 135–140.
POLLARD, D. (1982), “A Central Limit Theorem for k-means Clustering,” Annals of Probability, 10, 919–926.
SCOTT, A. J., and SYMONS, M. J. (1971), “Clustering Methods Based on Likelihood Ratio Criteria,” Biometrics, 27, 387–397.
SYMONS, M. J. (1981), “Clustering Criteria and Multivariate Normal Mixtures,” Biometrics, 37, 35–43.
WHITE, H. (1982), “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, 50, 1–25.
WILLIAMSON, J. A. (1984), “A Note on the Proof by H. E. Daniels of the Asymptotic Efficiency of a Maximum Likelihood Estimator,” Biometrika, 71, 651–653.
WINDHAM, M. P. (1986), “A Unification of Optimization-Based Numerical Classification Algorithms,” in Classification as a Tool of Research, Eds. W. Gaul and M. Schader, Amsterdam: North-Holland, 447–452.
WINDHAM, M. P. (1987), “Parameter Modification for Clustering Criteria,” Journal of Classification, 4, 191–214.
WINDHAM, M. P. (1989), “Statistical Models in Cluster Analysis,” Utah State University, Department of Mathematics and Statistics Research Report May/1989/45.