Necessary and Sufficient Conditions for Optimal Decision Trees using Dynamic Programming
This work addresses scalability for decision tree optimization, which is incremental by generalizing previous dynamic programming methods into a more flexible framework.
The paper tackled the scalability issue in globally optimizing decision trees by identifying necessary and sufficient conditions for separability in dynamic programming, enabling a framework that optimizes any combination of separable objectives and constraints, and experiments showed it outperformed general-purpose solvers by a large margin in scalability across five domains.
Global optimization of decision trees has shown to be promising in terms of accuracy, size, and consequently human comprehensibility. However, many of the methods used rely on general-purpose solvers for which scalability remains an issue. Dynamic programming methods have been shown to scale much better because they exploit the tree structure by solving subtrees as independent subproblems. However, this only works when an objective can be optimized separately for subtrees. We explore this relationship in detail and show the necessary and sufficient conditions for such separability and generalize previous dynamic programming approaches into a framework that can optimize any combination of separable objectives and constraints. Experiments on five application domains show the general applicability of this framework, while outperforming the scalability of general-purpose solvers by a large margin.