Output-Constrained Decision Trees
This work addresses the need for feasible predictions in real-world applications by extending decision trees to handle output constraints. The contribution is incremental in that it builds on existing tree methods.
The paper tackles the problem of incorporating domain-specific constraints into decision trees for multi-target regression. It introduces three new methods (M-OCRT, E-OCRT, EP-OCRT) and a random forest framework, and reports computational studies on synthetic and industry datasets in which these approaches produce accurate and feasible predictions.
Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained Regression Trees (OCRT), addressing the limitations of traditional decision trees in constrained multi-target regression tasks. We propose three approaches: M-OCRT, which uses split-based mixed integer programming to enforce constraints; E-OCRT, which employs an exhaustive search for optimal splits and solves constrained prediction problems at each decision node; and EP-OCRT, which applies post-hoc constrained optimization to tree predictions. To illustrate their potential use in ensemble learning, we also introduce a random forest framework that operates under convex feasible sets. We validate the proposed methods through a computational study on both synthetic and industry-driven hierarchical time series datasets. Our results demonstrate that imposing constraints during decision tree training yields accurate and feasible predictions.
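As a rough illustration of what a constrained prediction can look like, the sketch below takes the unconstrained mean of the targets in one leaf and projects it onto a feasible set. It is not the paper's formulation: it assumes a squared-error loss and a single linear equality constraint of the kind found in hierarchical time series (total = child_1 + child_2), for which the constrained minimizer is the Euclidean projection of the leaf mean. The function names, the constraint vector `a`, and the sample values are all hypothetical.

```python
def leaf_mean(targets):
    """Unconstrained leaf prediction: per-target mean of the leaf's samples."""
    n = len(targets)
    return [sum(col) / n for col in zip(*targets)]


def project_onto_hyperplane(y, a, b=0.0):
    """Euclidean projection of y onto the set {z : a . z = b}."""
    residual = sum(ai * yi for ai, yi in zip(a, y)) - b
    norm_sq = sum(ai * ai for ai in a)
    return [yi - ai * residual / norm_sq for ai, yi in zip(a, y)]


# Hypothetical leaf with three samples; each row is (total, child_1, child_2).
# The observed rows do not all satisfy the hierarchy, so neither does their mean.
leaf_targets = [
    [10.0, 4.0, 5.0],
    [12.0, 6.0, 5.0],
    [11.0, 5.0, 5.0],
]

# Assumed constraint: total - child_1 - child_2 = 0.
a = [1.0, -1.0, -1.0]

unconstrained = leaf_mean(leaf_targets)                # may violate the constraint
feasible = project_onto_hyperplane(unconstrained, a)   # satisfies it exactly

print("unconstrained:", unconstrained)
print("feasible:     ", feasible)
```

Under these assumptions the projection is available in closed form. For a general convex feasible set, the same step would require a constrained optimization solver instead.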