Computation of Least Trimmed Squares: A Branch-and-Bound framework with Hyperplane Arrangement Enhancements
For practitioners in robust statistics, this work enables exact robust regression at significantly larger sample sizes in low-dimensional settings, addressing a key scalability bottleneck.
The paper tackles the NP-hard penalized least trimmed squares regression problem, proposing a new mixed-integer optimization formulation and tailored branch-and-bound algorithm that achieves a 1% optimality gap in 1 minute on instances with 5000 samples and 20 features, where existing methods fail within one hour.
We study computational aspects of a key problem in robust statistics -- the penalized least trimmed squares (LTS) regression problem, a robust estimator that mitigates the influence of outliers in data by capping residuals with large magnitudes. Although statistically attractive, penalized LTS is NP-hard, and existing mixed-integer optimization (MIO) formulations scale poorly due to weak relaxations and exponential worst-case complexity in the number of observations. We propose a new MIO formulation that embeds hyperplane arrangement logic into a perspective reformulation, explicitly enforcing structural properties of optimal solutions. We show that, if the number of features is fixed, the resulting branch-and-bound tree is of polynomial size in the sample size. Moreover, we develop a tailored branch-and-bound algorithm that uses first-order methods with dual bounds to solve node relaxations efficiently. Computational experiments on synthetic and real datasets demonstrate substantial improvements over existing MIO approaches: on synthetic instances with 5000 samples and 20 features, our tailored solver reaches a 1% gap in 1 minute while competing approaches fail to do so within one hour. These gains enable exact robust regression at significantly larger sample sizes in low-dimensional settings.