An Empirical Comparison of Pruning Methods for Decision Tree Induction

Machine Learning - Volume 4 - Pages 227-243 - 1989

John Mingers¹

¹School of Industrial and Business Studies, University of Warwick, Coventry, England

Abstract

This paper compares five methods for pruning decision trees, developed from sets of examples. When used with uncertain rather than deterministic data, decision-tree induction involves three main stages—creating a complete tree able to classify all the training examples, pruning this tree to give statistical reliability, and processing the pruned tree to improve understandability. This paper concerns the second stage—pruning. It presents empirical comparisons of the five methods across several domains. The results show that three methods—critical value, error complexity and reduced error—perform well, while the other two may cause problems. They also show that there is no significant interaction between the creation and pruning methods.
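
Reduced error pruning is one of the three methods the abstract reports as performing well. As a rough illustration only, the sketch below shows one common bottom-up formulation: a subtree is replaced by a leaf whenever doing so does not increase the error count on a separate pruning set. The `Node` class, the helper names, and the toy weather data are hypothetical and do not come from the paper, and the procedure evaluated in the paper may differ in its details.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# An example is (attribute values, class label). Assumed representation.
Example = Tuple[Dict[str, str], str]


@dataclass
class Node:
    majority: str                     # majority training class at this node
    attribute: Optional[str] = None   # attribute tested here; None for a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return self.attribute is None


def classify(node: Node, x: Dict[str, str]) -> str:
    """Walk down the tree; fall back to the majority class on unseen values."""
    while not node.is_leaf():
        child = node.children.get(x.get(node.attribute))
        if child is None:
            break
        node = child
    return node.majority


def errors(node: Node, examples: List[Example]) -> int:
    """Number of examples the (sub)tree misclassifies."""
    return sum(classify(node, x) != y for x, y in examples)


def reduced_error_prune(node: Node, pruning_set: List[Example]) -> Node:
    """Prune bottom-up, keeping a subtree only if it beats a single leaf."""
    if node.is_leaf():
        return node
    # Prune each child on the pruning examples that reach it.
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in pruning_set if x.get(node.attribute) == value]
        node.children[value] = reduced_error_prune(child, subset)
    # Replace this subtree by a leaf if that is no worse on the pruning set.
    leaf = Node(majority=node.majority)
    if errors(leaf, pruning_set) <= errors(node, pruning_set):
        return leaf
    return node


# Toy tree: the "windy" split under "sunny" is assumed to have fitted noise.
tree = Node(
    majority="yes",
    attribute="outlook",
    children={
        "sunny": Node(
            majority="yes",
            attribute="windy",
            children={"yes": Node(majority="no"), "no": Node(majority="yes")},
        ),
        "rain": Node(majority="no"),
    },
)

pruning_set: List[Example] = [
    ({"outlook": "sunny", "windy": "yes"}, "yes"),
    ({"outlook": "sunny", "windy": "no"}, "yes"),
    ({"outlook": "rain", "windy": "no"}, "no"),
    ({"outlook": "rain", "windy": "yes"}, "no"),
]

pruned = reduced_error_prune(tree, pruning_set)
```

In this toy case the `windy` test is collapsed into a single leaf because the leaf makes fewer errors on the pruning set, while the `outlook` test at the root is kept because removing it would increase the error count.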

