A comparison of extrinsic clustering evaluation metrics based on formal constraints
Abstract
Keywords
References
Artiles, J., Gonzalo, J., & Sekine, S. (2007). The SemEval-2007 WePS evaluation: Establishing a benchmark for the web people search task. In Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), June 23–24 (pp. 64–69). Prague.
Bagga, A., & Baldwin, B. (1998). Entity-based cross-document coreferencing using the vector space model. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics (COLING-ACL’98) (pp. 79–85). Montreal.
Bakus, J., Hussin, M. F., & Kamel, M. (2002). A SOM-based document clustering using phrases. In Proceedings of the 9th International Conference on Neural Information Processing (ICONIP’02) (pp. 2212–2216). Singapore.
Dom, B. (2001). An information-theoretic external cluster-validity measure. IBM Research Report.
Ghosh, J. (2003). Scalable clustering methods for data mining. In N. Ye (Ed.), Handbook of data mining. NJ: Lawrence Erlbaum.
Gonzalo, J., & Peters, C. (2005). The impact of evaluation on multilingual text retrieval. In Proceedings of SIGIR 2005 (pp. 603–604). Salvador de Bahia.
Halkidi, M., Batistakis, Y., & Vazirgiannis, M. (2001). On clustering validation techniques. Journal of Intelligent Information Systems, 17(2–3), 107–145.
Larsen, B., & Aone, C. (1999). Fast and effective text mining using linear-time document clustering. In Knowledge Discovery and Data Mining (pp. 16–22). San Diego, CA.
Meila, M. (2003). Comparing clusterings. In Proceedings of COLT 03. Washington, DC.
Pantel, P., & Lin, D. (2002). Efficiently clustering documents with committees. In Proceedings of the PRICAI 2002 7th Pacific Rim International Conference on Artificial Intelligence (pp. 18–22). Tokyo, Japan.
Rosenberg, A., & Hirschberg, J. (2007). V-measure: A conditional entropy-based external cluster evaluation measure. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (pp. 410–420). Prague.
Steinbach, M., Karypis, G., & Kumar, V. (2000). A comparison of document clustering techniques. In KDD 2000 (pp. 109–110). Boston, MA.
Strehl, A. (2002). Relationship-based clustering and cluster ensembles for high-dimensional data mining. PhD thesis, The University of Texas at Austin.
Xu, W., Liu, X., & Gong, Y. (2003). Document clustering based on non-negative matrix factorization. In SIGIR ’03: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 267–273). NY: ACM Press.
Zhao, Y., & Karypis, G. (2001). Criterion functions for document clustering: Experiments and analysis. Technical Report TR 01-40. Department of Computer Science, University of Minnesota, Minneapolis, MN.