LGAIOct 27, 2025

RS-ORT: A Reduced-Space Branch-and-Bound Algorithm for Optimal Regression Trees

arXiv:2510.23901v11 citationsh-index: 1
Originality Incremental advance
AI Analysis

This addresses the problem of efficiently learning optimal regression trees for practitioners dealing with large datasets, though it appears incremental as it builds on existing MIP frameworks with specialized optimizations.

The paper tackles the computational intractability of training optimal regression trees on large-scale continuous data by proposing RS-ORT, a reduced-space branch-and-bound algorithm that guarantees convergence and handles datasets with up to 2,000,000 samples in four hours while delivering superior training and testing performance.

Mixed-integer programming (MIP) has emerged as a powerful framework for learning optimal decision trees. Yet, existing MIP approaches for regression tasks are either limited to purely binary features or become computationally intractable when continuous, large-scale data are involved. Naively binarizing continuous features sacrifices global optimality and often yields needlessly deep trees. We recast the optimal regression-tree training as a two-stage optimization problem and propose Reduced-Space Optimal Regression Trees (RS-ORT) - a specialized branch-and-bound (BB) algorithm that branches exclusively on tree-structural variables. This design guarantees the algorithm's convergence and its independence from the number of training samples. Leveraging the model's structure, we introduce several bound tightening techniques - closed-form leaf prediction, empirical threshold discretization, and exact depth-1 subtree parsing - that combine with decomposable upper and lower bounding strategies to accelerate the training. The BB node-wise decomposition enables trivial parallel execution, further alleviating the computational intractability even for million-size datasets. Based on the empirical studies on several regression benchmarks containing both binary and continuous features, RS-ORT also delivers superior training and testing performance than state-of-the-art methods. Notably, on datasets with up to 2,000,000 samples with continuous features, RS-ORT can obtain guaranteed training performance with a simpler tree structure and a better generalization ability in four hours.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes