Feature selection for text classification with Naïve Bayes

Expert Systems with Applications - Volume 36 - Pages 5432-5435 - 2009
Jingnian Chen1,2, Houkuan Huang1, Shengfeng Tian1, Youli Qu1
1School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
2Department of Information and Computing Science, Shandong University of Finance, Jinan, Shandong 250014, China

References

Forman, 2003, An extensive empirical study of feature selection metrics for text classification, Journal of Machine Learning Research, 3, 1289, 10.1162/153244303322753670

John, 1994, Irrelevant Features and the Subset Selection Problem, 121

Kim, 2006, Some effective techniques for Naive Bayes text classification, IEEE Transactions on Knowledge and Data Engineering, 18, 1457, 10.1109/TKDE.2006.180

Lewis, 1998, Naive Bayes at forty: The independence assumption in information retrieval, 4

Lewis, D.D., & Ringuette, M. (1994). Comparison of two learning algorithms for text categorization. In Proceedings of the third annual symposium on document analysis and information retrieval (pp. 81-93). Las Vegas, NV.

Mladenic, 2003, Feature selection on hierarchy of web documents, Decision Support Systems, 35, 45, 10.1016/S0167-9236(02)00097-0

Shang, 2007, A novel feature selection algorithm for text categorization, Expert Systems with Applications, 33, 1, 10.1016/j.eswa.2006.04.001

Tan, 2005, Neighbor-weighted K-nearest neighbor for unbalanced text corpus, Expert Systems with Applications, 28, 667, 10.1016/j.eswa.2004.12.023

Wiener, E. D., Pedersen, J. O., & Weigend, A. S. (1995). A neural network approach to topic spotting. In Proceedings of SDAIR-95, 4th annual symposium on document analysis and information retrieval (pp. 317–332).

Yang, 1997, An evaluation of statistical approaches to text categorization, Information Retrieval, 1, 76

Yang, 1994, An example-based mapping method for text categorization and retrieval, ACM Transactions on Information Systems, 12, 252, 10.1145/183422.183424