Time and space complexity of deterministic and nondeterministic decision trees

Springer Science and Business Media LLC - Tập 91 Số 1 - Trang 45-74 - 2023
Mikhail Moshkov1
1Computer, Electrical and Mathematical Sciences and Engineering Division and Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia

Tóm tắt

AbstractIn this paper, we study arbitrary infinite binary information systems each of which consists of an infinite set called universe and an infinite set of two-valued functions (attributes) defined on the universe. We consider the notion of a problem over information system, which is described by a finite number of attributes and a mapping associating a decision to each tuple of attribute values. As algorithms for problem solving, we use deterministic and nondeterministic decision trees. As time and space complexity, we study the depth and the number of nodes in the decision trees. In the worst case, with the growth of the number of attributes in the problem description, (i) the minimum depth of deterministic decision trees grows either almost as logarithm or linearly, (ii) the minimum depth of nondeterministic decision trees either is bounded from above by a constant or grows linearly, (iii) the minimum number of nodes in deterministic decision trees has either polynomial or exponential growth, and (iv) the minimum number of nodes in nondeterministic decision trees has either polynomial or exponential growth. Based on these results, we divide the set of all infinite binary information systems into five complexity classes, and study for each class issues related to time-space trade-off for decision trees.

Từ khóa


Tài liệu tham khảo

AbouEisha, H., Amin, T., Chikalov, I., Hussain, S., Moshkov, M.: Extensions of Dynamic Programming for Combinatorial Optimization and Data Mining. Intelligent Systems Reference Library, vol. 146. Springer (2019)

Alsolami, F., Azad, M., Chikalov, I., Moshkov, M.: Decision and Inhibitory Trees and Rules for Decision Tables with Many-valued Decisions. Intelligent Systems Reference Library, vol. 156. Springer (2020)

Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth (1984)

Moshkov, M.: Time complexity of decision trees. In: Trans. Rough Sets III. Lecture Notes in Computer Science, vol. 3400, pp. 244–459. Springer (2005)

Moshkov, M., Zielosko, B.: Combinatorial Machine Learning - A Rough Set Approach. Studies in Computational Intelligence, vol. 360. Springer (2011)

Rokach, L., Maimon, O.: Data Mining with Decision Trees - Theory and Applications. Series in Machine Perception and Artificial Intelligence, vol. 69. WorldScientific (2007)

Boros, E., Hammer, P.L., Ibaraki, T., Kogan, A.: Logical analysis of numerical data. Math. Program. 79, 163–190 (1997)

Boros, E., Hammer, P.L., Ibaraki, T., Kogan, A., Mayoraz, E., Muchnik, I.B.: An implementation of logical analysis of data. IEEE Trans. Knowl. Data Eng. 12(2), 292–306 (2000)

Chikalov, I., Lozin, V.V., Lozina, I., Moshkov, M., Nguyen, H.S., Skowron, A., Zielosko, B.: Three Approaches to Data Analysis - Test Theory, Rough Sets and Logical Analysis of Data. Intelligent Systems Reference Library, vol. 41. Springer (2013)

Fürnkranz, J., Gamberger, D., Lavrac, N.: Foundations of Rule Learning. Cognitive Technologies. Springer (2012)

Lejeune, M.A., Lozin, V.V., Lozina, I., Ragab, A., Yacout, S.: Recent advances in the theory and practice of logical analysis of data. Eur. J. Oper. Res. 275(1), 1–15 (2019)

Moshkov, M., Piliszczuk, M., Zielosko, B.: Partial Covers, Reducts and Decision Rules in Rough Sets - Theory and Applications. Studies in Computational Intelligence, vol. 145. Springer (2008)

Pawlak, Z.: Rough Sets - Theoretical Aspects of Reasoning About Data. Theory and Decision Library: Series D, vol. 9. Kluwer (1991)

Pawlak, Z., Skowron, A.: Rudiments of rough sets. Inf. Sci. 177 (1), 3–27 (2007)

Abdelhalim, A., Traoré, I., Sayed, B.: RBDT-1: A new rule-based decision tree generation technique. In: Governatori, G, Hall, J., Paschke, A. (eds.) Rule Interchange and Applications, International Symposium, RuleML 2009, Las Vegas, Nevada, USA, November 5-7, 2009. Lecture Notes in Computer Science, vol. 5858, pp. 108–121. Springer (2009)

Abdelhalim, A., Traoré, I., Nakkabi, Y.: Creating decision trees from rules using RBDT-1. Comput. Intell. 32(2), 216–239 (2016)

Aglin, G., Nijssen, S., Schaus, P.: Learning optimal decision trees using caching branch-and-bound search. In: 34th AAAI Conference on Artificial Intelligence, AAAI 2020, pp. 3146–3153 (2020)

Avellaneda, F.: Efficient inference of optimal decision trees. In: 34th AAAI Conference on Artificial Intelligence, AAAI 2020, pp. 3195–3202 (2020)

Bertsimas, D., Dunn, J.: Optimal classification trees. Mach. Learn. 106(7), 1039–1082 (2017)

Demirovic, E., Lukina, A., Hebrard, E., Chan, J., Bailey, J., Leckie, C., Ramamohanarao, K., Stuckey, P.J.: Murtree: Optimal decision trees via dynamic programming and search. J. Mach. Learn. Res. 23, 26–12647 (2022)

Przybyla-Kasperek, M., Aning, S.: Bagging and single decision tree approaches to dispersed data. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds.) Computational Science - ICCS 2021 - 21st International Conference, Krakow, Poland, June 16-18, 2021, Proceedings, Part III. Lecture Notes in Computer Science, vol. 12744, pp. 420–427. Springer (2021)

Kaufman, K.A., Michalski, R.S., Pietrzykowski, J., Wojtusiak, J.: An integrated multi-task inductive database VINLEN: initial implementation and early results. In: Dzeroski, S., Struyf, J. (eds.) Knowledge Discovery in Inductive Databases, 5th International Workshop, KDID 2006, Berlin, Germany, September 18, 2006, Revised Selected and Invited Papers. Lecture Notes in Computer Science, vol. 4747, pp. 116–133. Springer (2006)

Narodytska, N., Ignatiev, A., Pereira, F., Marques-Silva, J.: Learning optimal decision trees with SAT. In: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, pp. 1362–1368 (2018)

Verwer, S., Zhang, Y.: Learning optimal classification trees using a binary linear program formulation. In: 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, pp. 1625–1632 (2019)

Zabinski, K., Zielosko, B.: Decision rules construction: Algorithm based on EAV model. Entropy 23(1), 14 (2021)

Zielosko, B., Zabinski, K.: Selected approaches for decision rules construction-comparative study. In: Knowledge-Based and Intelligent Information & Engineering Systems: 25th International Conference KES-2021, Virtual Event / Szczecin, Poland, 8-10 September 2021. Procedia Computer Science, vol. 192, pp. 3667–3676. Elsevier (2021)

Molnar, C.: Interpretable Machine Learning. A Guide for Making Black Box Models Explainable, 2nd edn. christophm.github.io/interpretable-ml-book/ (2022)

Wegener, I.: Time-space trade-offs for branching programs. J. Comput. Syst. Sci. 32(1), 91–96 (1986)

Beame, P., Jayram, T.S., Saks, M.E.: Time-space tradeoffs for branching programs. J. Comput. Syst. Sci. 63(4), 542–572 (2001)

Chikalov, I., Hussain, S., Moshkov, M.: Totally optimal decision trees for Boolean functions. Discret. Appl. Math. 215, 1–13 (2016)

Dobkin, D.P., Lipton, R.J.: A lower bound of the (1/2)n2 on linear search programs for the knapsack problem. J. Comput. Syst. Sci. 16(3), 413–417 (1978)

Dobkin, D.P., Lipton, R.J.: On the complexity of computations under varying sets of primitives. J. Comput. Syst. Sci. 18(1), 86–91 (1979)

Morávek, J.: A localization problem in geometry and complexity of discrete programming. Kybernetika 8(6), 498–516 (1972)

Ben-Or, M.: Lower bounds for algebraic computation trees (preliminary report). In: 15th Annual ACM Symposium on Theory of Computing, STOC 1983, pp. 80–86 (1983)

Gabrielov, A., Vorobjov, N.: On topological lower bounds for algebraic computation trees. Found. Comput. Math. 17(1), 61–72 (2017)

Grigoriev, D., Karpinski, M., Vorobjov, N.: Improved lower bound on testing membership to a polyhedron by algebraic decision trees. In: 36th Annual Symposium on Foundations of Computer Science, FOCS 1995, pp. 258–265 (1995)

Grigoriev, D., Karpinski, M., Yao, A.C.: An exponential lower bound on the size of algebraic decision trees for Max. Comput. Complex. 7(3), 193–203 (1998)

Steele, J.M., Yao, A.C.: Lower bounds for algebraic decision trees. J. Algorithms 3(1), 1–8 (1982)

Yao, A.C.: Algebraic decision trees and Euler characteristics. In: 33rd Annual Symposium on Foundations of Computer Science, FOCS 1992, pp. 268–277 (1992)

Yao, A.C.: Decision tree complexity and Betti numbers. In: 26th Annual ACM Symposium on Theory of Computing, STOC 1994, pp. 615–624 (1994)

Moshkov, M.: Optimization problems for decision trees. Fundam. Inform. 21(4), 391–401 (1994)

Moshkov, M.: Decision Trees. Theory and Applications (in Russian). Nizhny Novgorod University Publishers (1994)

Moshkov, M.: Two approaches to investigation of deterministic and nondeterministic decision trees complexity. In: 2nd World Conference on the fundamentals of artificial intelligence, WOCFAI 1995, pp. 275–280 (1995)

Moshkov, M.: Comparative analysis of deterministic and nondeterministic decision tree complexity. Global approach Fundam. Inform. 25(2), 201–214 (1996)

Pawlak, Z.: Information systems theoretical foundations. Inf. Syst. 6(3), 205–218 (1981)

Blum, M., Impagliazzo, R.: Generic oracles and oracle classes (extended abstract). In: 28th Annual Symposium on Foundations of Computer Science, Los Angeles, California, USA, 27-29 October 1987, pp. 118–126 (1987)

Hartmanis, J., Hemachandra, L.A.: One-way functions, robustness, and the non-isomorphism of NP-complete sets. In: 2nd Annual Conference on Structure in Complexity Theory, Cornell University, Ithaca, New York, USA, June 16-19, 1987 (1987)

Moshkov, M.: About the depth of decision trees computing Boolean functions. Fundam. Inform. 22(3), 203–215 (1995)

Tardos, G.: Query complexity, or why is it difficult to separate NPA ∩ coNPA from PA by random oracles A? Comb. 9(4), 385–392 (1989)

Buhrman, H., de Wolf, R.: Complexity measures and decision tree complexity: A survey. Theor. Comput. Sci. 288(1), 21–43 (2002)

Dobkin, D.P., Lipton, R.J.: Multidimensional searching problems. SIAM J. Comput. 5(2), 181–186 (1976)

Moshkov, M.: On conditional tests. Sov. Phys. Dokl. 27, 528–530 (1982)

Meyer auf der Heide, F.: A polynomial linear search algorithm for the n-dimensional knapsack problem. In: 15th Annual ACM Symposium on Theory of Computing, STOC 1983, pp. 70–79 (1983)

Moshkov, M.: Classification of infinite information systems depending on complexity of decision trees and decision rule systems. Fundam. Inform. 54(4), 345–368 (2003)

Moshkov, M.: Comparative analysis of deterministic and nondeterministic decision tree complexity. Local approach. In: Trans. Rough Sets IV. Lecture Notes in Computer Science, vol. 3700, pp. 125–143. Springer (2005)

Moshkov, M.: Comparative Analysis of Deterministic and Nondeterministic Decision Trees. Intelligent Systems Reference Library, vol. 179. Springer (2020)

Moshkov, M.: On time and space complexity of deterministic and nondeterministic decision trees. In: 8th International Conference Information Processing and Management of Uncertainty in Knowledge-based Systems, IPMU 2000, vol. 3, pp. 1932–1936 (2000)

Naiman, D.Q., Wynn, H.P.: Independence number and the complexity of families of sets. Discr. Math. 154, 203–216 (1996)

Sauer, N.: On the density of families of sets. J. Comb. Theory (A) 13, 145–147 (1972)

Shelah, S.: A combinatorial problem; stability and order for models and theories in infinitary languages. Pacific J. Math. 41, 241–261 (1972)