LGMay 24, 2024

Output-Constrained Decision Trees

arXiv:2405.15314v4h-index: 6
Originality Incremental advance
AI Analysis

This work addresses the need for feasible predictions in real-world applications by extending decision trees to handle constraints, though it is incremental as it builds on existing tree methods.

The paper tackled the problem of incorporating domain-specific constraints into decision trees for multi-target regression, introducing three new methods (M-OCRT, E-OCRT, EP-OCRT) and a random forest framework, and demonstrated that these approaches produce accurate and feasible predictions in computational studies on synthetic and industry datasets.

Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained Regression Trees (OCRT), addressing the limitations of traditional decision trees in constrained multi-target regression tasks. We propose three approaches: M-OCRT, which uses split-based mixed integer programming to enforce constraints; E-OCRT, which employs an exhaustive search for optimal splits and solves constrained prediction problems at each decision node; and EP-OCRT, which applies post-hoc constrained optimization to tree predictions. To illustrate their potential uses in ensemble learning, we also introduce a random forest framework working under convex feasible sets. We validate the proposed methods through a computational study both on synthetic and industry-driven hierarchical time series datasets. Our results demonstrate that imposing constraints on decision tree training results in accurate and feasible predictions.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes