On the Discrepancy between Kleinberg’s Clustering Axioms and k-Means Clustering Algorithm Behavior
Tóm tắt
This paper performs an investigation of Kleinberg’s axioms (from both an intuitive and formal standpoint) as they relate to the well-known k-mean clustering method. The axioms, as well as a novel variations thereof, are analyzed in Euclidean space. A few natural properties are proposed, resulting in k-means satisfying the intuition behind Kleinberg’s axioms (or, rather, a small, and natural variation on that intuition). In particular, two variations of Kleinberg’s consistency property are proposed, called centric consistency and motion consistency. It is shown that these variations of consistency are satisfied by k-means.
Tài liệu tham khảo
Ackerman, M (2012) Towards theoretical foundations of clustering,. University of Waterloo, PhD Thesis.
Ackerman, M., Ben-David, S., & Loker, D. (2010) Characterization of linkage-based clustering. In COLT 2010 - The 23rd Conference on Learning Theory, Haifa, Israel , 27-29, 2010, (pp. 270–281).
Ackerman, M., Ben-David, S., & Loker, D. (2010) Towards property-based classification of clustering paradigms. In J.D. Lafferty, C.K.I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems 23, (pp 10–18). Curran Associates, Inc.,
Ackerman, M., Ben-David, S., Loker, D., & Sabato, S.(2103) Clustering oligarchies. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, AISTATS , Scottsdale, AZ, USA, April 29 - May 1, 2013, (pp. 66–74).
Ackerman, M., & Dasgupta, S. (2014) Incremental clustering: The case for extra clusters. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, (pp. 307–315) .
Ambroszkiewicz, S., & Koronacki, J. (2010) O pewnym wyniku orzekaja̧cym niemożność przeprowadzenia analizy skupień. krótki dowód z komentarzem. Decision Support Systems Conferernce, Zakopane, Poland, 2010, lecture, .
Arthur, D., & Vassilvitskii, S. (2007) \(k\)-means++: the advantages of careful seeding. In N. Bansal, K. Pruhs, and C. Stein, editors, Proc. of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2007, (pp. 1027–1035), New Orleans, Louisiana, USA, 7-9 Jan. SIAM.
Ben-David, S. (December 2005) Attempts to axiomatize clustering, 2005. NIPS Workshop,.
Ben-David, S., & Ackerman, M. (2009) Measures of clustering quality: A working set of axioms for clustering. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, (pp 121–128). Curran Associates, Inc.,.
Blum, A. (2009) Thoughts on clustering . essay for the 2009 nips workshop "clustering: Science or art?"
Carlsson, G., & Mémoli, F. (2008) Persistent clustering and a theorem of j. kleinberg. arXiv preprint arXiv:0808.2241,.
Carlsson, G., & Mémoli, F. (2010). Characterization, stability and convergence of hierarchical clustering methods. Journal of Machine Learning Research, 11, 1425–1470.
Duda, R. O., Hart, P. E., & Stork, G. (2000). Pattern Classification (2nd ed.). New York: J. Wiley & Sons.
Hopcroft, J., & Kannan, R. (2012) Computer science theory for the information age, chapter 8.13.2. A Satisfiable Set of Axioms. page 272ff.
Kanungo, T., Mount, D. M., Netanyahu, N. S., Piatko, C. D., Silverman, R., & A. Y. Wu. A (2002) local search approximation algorithm for k-means clustering. In Proceedings of the 18th Annual Symposium on Computational Geometry, Barcelona, Spain, June 5-7, 2002, (pp. 10–18)
Kleinberg, J. (2002) An impossibility theorem for clustering. In Proc. NIPS 2002, pages 446–453,http://books.nips.cc/papers/files/nips15/LT17.pdf.
Klopotek, M., & Klopotek, R. (2021) Issues in clustering algorithm consistency in fixed dimensional spaces. some solutions for k-means. to appear in Journal of Intelligent Information Systems,.
Klopotek, M. A. (2019) A note on k-means probabilistic poverty. CoRR, abs/1910.00413,.
Klopotek, M. A. (2020). An aposteriorical clusterability criterion for k-means++ and simplicity of clustering. SN Computer Science, 1(2), 80.
Klopotek, M. A., & Klopotek, R. A. (2020) In-the-limit clustering axioms. In Artificial Intelligence and Soft Computing, volume 12416 of LNCS, (pp. 199–209). Springer,.
Klopotek, M. A., Wierzchon, S. T., & Klopotek, R. A. (2020) k-means cluster shape implications. In Artificial Intelligence Applications and Innovations, volume 583 of IFIP Advances in Information and Communication Technology, (pp. 107–118). Springer,.
Kłopotek, M.A. (1991) On the phenomenon of flattening ’flexible prediction’ concept hierarchy. In Fundamentals of Artificial Intelligence Research. International Workshop, volume 535 of LNAI, (pp. 99–111). Springer
Klopotek, R. A., & Klopotek, M. A. (2019) robabilistic k-richness of the k-means algorithms. In Machine Learning, Optimization, and Data Science, volume 11943 of LNCS, (pp. 259–271). Springer
Meilǎ, M. (2005) Comparing clusterings: An axiomatic view. In Proceedings of the 22Nd International Conference on Machine Learning, ICML ’05, pages 577–584, New York, NY, USA, ACM.
Ostrovsky, R., Rabani, Y., Schulman, L. J., & Swamy, C. (January 2013) The effectiveness of lloyd-type methods for the k-means problem. J. ACM, 59(6):28:1–28:22. 0.0000001 is epsilon so that epsilon square<target kmeans for k/target kmeans for k-1.
Song, M., & Rajasekaran, S. (2010). Fast algorithms for constant approximation k-means clustering. Transaction MLDM, 3(2), 67–79.
Steinbach, M., Karypis, G., & Kumar, V. (2000) A comparison of document clustering techniques. In Proceedings of KDD Workshop on Text Mining, Proceedings of the 6th International Confer1193 ence on Knowledge discovery and Data Mining, Boston,MA, (2000).
van Laarhoven, T., & Marchiori, E. (2014). Axioms for graph clustering quality functions. Journal of Machine Learning Research, 15, 193–215.
von Luxburg, U. (2009). Clustering stability: An overview. Foundations and Trends in Machine Learning, 2(3), 235–274.
von Luxburg, U., Williamson, R. C., & Guyon, I. (2011) Clustering: Science or art? initially, opinion paper for the NIPS Workshop "Clustering: Science or Art" in 2009,
Wright, W. E. (1973). A formalization of cluster analysis. Pattern Recognition, 5(3), 273–282.
Zadeh, R. B., & Ben-David, S. (2009) A uniqueness theorem for clustering. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI ’09, (pp. 639–646), Arlington, Virginia, United States. AUAI Press.