Optimal classification trees

Machine Learning, Volume 106, Issue 7, pp. 1039–1082, 2017
Dimitris Bertsimas1, Jack Dunn2
1Operations Research Center, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Abstract

Keywords


References

Arthanari, T., & Dodge, Y. (1981). Mathematical programming in statistics (Vol. 341). New York: Wiley.

Auer, P., Holte, R. C., & Maass, W. (1995). Theory and applications of agnostic pac-learning with small decision trees. In Proceedings of the 12th international conference on machine learning (pp. 21–29).

Bennett, K. P. (1992). Decision tree construction via linear programming. In M. Evans (Ed.), Proceedings of the 4th midwest artificial intelligence and cognitive science society conference (pp. 97–101).

Bennett, K. P., & Blue, J. (1996). Optimal decision trees. Rensselaer Polytechnic Institute Math Report No. 214.

Bennett, K. P., & Blue, J. A. (1998). A support vector machine approach to decision trees. In IEEE international joint conference on neural networks proceedings. IEEE world congress on computational intelligence (Vol. 3, pp. 2396–2401).

Bertsimas, D., & King, A. (2015). An algorithmic approach to linear regression. Operations Research, 64(1), 2–16.

Bertsimas, D., & King, A. (2017). Logistic regression: From art to science. Statistical Science (to appear).

Bertsimas, D., & Mazumder, R. (2014). Least quantile regression via modern optimization. The Annals of Statistics, 42(6), 2494–2525.

Bertsimas, D., & Shioda, R. (2007). Classification and regression via integer optimization. Operations Research, 55(2), 252–271.

Bertsimas, D., & Weismantel, R. (2005). Optimization over integers. Belmont, MA: Dynamic Ideas.

Bertsimas, D., King, A., & Mazumder, R. (2016). Best subset selection via a modern optimization lens. Annals of Statistics, 44(2), 813–852.

Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2014). Julia: A fresh approach to numerical computing. arXiv preprint arXiv:1411.1607

Bixby, R. E. (2012). A brief history of linear and mixed-integer programming computation. Documenta Mathematica, Extra Volume: Optimization Stories, 107–121.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Monterey, CA: Wadsworth and Brooks.

Cox, L. A., Jr., Yuping, Q., & Kuehner, W. (1989). Heuristic least-cost computation of discrete classification functions with uncertain argument values. Annals of Operations Research, 21(1), 1–29.

Esmeir, S., & Markovitch, S. (2007). Anytime learning of decision trees. The Journal of Machine Learning Research, 8, 891–933.

Gurobi Optimization Inc. (2015a). Gurobi 6.0 performance benchmarks. http://www.gurobi.com/pdfs/benchmarks.pdf. Accessed September 5, 2015.

Gurobi Optimization Inc. (2015b). Gurobi optimizer reference manual. http://www.gurobi.com.

Heath, D., Kasif, S., & Salzberg, S. (1993). Induction of oblique decision trees. In IJCAI, Citeseer (pp. 1002–1007).

Hyafil, L., & Rivest, R. L. (1976). Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1), 15–17.

IBM ILOG CPLEX. (2014). V12.1 user's manual. https://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/.

Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22. http://CRAN.R-project.org/doc/Rnews/.

Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.

Loh, W. Y., & Shih, Y. S. (1997). Split selection methods for classification trees. Statistica Sinica, 7(4), 815–840.

López-Chau, A., Cervantes, J., López-García, L., & Lamont, F. G. (2013). Fisher's decision tree. Expert Systems with Applications, 40(16), 6283–6291.

Lubin, M., & Dunning, I. (2015). Computing in operations research using Julia. INFORMS Journal on Computing, 27(2), 238–248.

Murthy, S., & Salzberg, S. (1995a). Lookahead and pathology in decision tree induction. In IJCAI, Citeseer (pp. 1025–1033).

Murthy, S. K., & Salzberg, S. (1995b). Decision tree induction: How effective is the greedy heuristic? In KDD (pp. 222–227).

Murthy, S. K., Kasif, S., & Salzberg, S. (1994). A system for induction of oblique decision trees. Journal of Artificial Intelligence Research, 2, 1–32.

Nemhauser, G. L. (2013). Integer programming: The global impact. Presented at the EURO-INFORMS Joint International Meeting, Rome, Italy, 2013. http://euro-informs2013.org/data/http_/euro2013.org/wp-content/uploads/nemhauser.pdf. Accessed September 9, 2015.

Norouzi, M., Collins, M. D., Johnson, M. A., Fleet, D. J., & Kohli, P. (2015). Efficient non-greedy optimization of decision trees. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 28 (NIPS 2015), December 7–12, 2015, Montreal, QC (pp. 1729–1737).

Norton, S. W. (1989). Generating better decision trees. In IJCAI (Vol. 89, pp. 800–805).

Payne, H. J., & Meisel, W. S. (1977). An algorithm for constructing optimal binary decision trees. IEEE Transactions on Computers, 100(9), 905–916.

Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.

Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Francisco, CA: Morgan Kaufmann.

R Core Team. (2015). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. http://www.R-project.org/.

Son, N. H. (1998). From optimal hyperplanes to optimal decision trees. Fundamenta Informaticae, 34(1–2), 145–174.

Therneau, T., Atkinson, B., & Ripley, B. (2015). rpart: Recursive partitioning and regression trees. http://CRAN.R-project.org/package=rpart, R package version 4.1-9.

Tjortjis, C., & Keane, J. (2002). T3: A classification algorithm for data mining. Lecture Notes in Computer Science (Vol. 2412, pp. 50–55). Berlin: Springer.

Top500 Supercomputer Sites. (2015). Performance development. http://www.top500.org/statistics/perfdevel/. Accessed September 4, 2015.

Truong, A. (2009). Fast growing and interpretable oblique trees via logistic regression models. Ph.D. thesis, University of Oxford.

Tzirakis, P., & Tjortjis, C. (2016). T3C: Improving a decision tree classification algorithm's interval splits on continuous attributes. Advances in Data Analysis and Classification, 1–18.

Wickramarachchi, D., Robertson, B., Reale, M., Price, C., & Brown, J. (2016). HHCART: An oblique decision tree. Computational Statistics & Data Analysis, 96, 12–23.