Robust Regression over Averaged Uncertainty
This work provides a theoretical link between robust optimization and regression methods, with incremental improvements for regression problems under uncertainty.
The paper tackles robust regression by integrating over all realizations of the uncertainty set and taking an averaged approach, showing that this recovers ridge regression exactly and links robust optimization to mean squared error methods. It demonstrates consistent improvements in out-of-sample performance over worst-case formulations on synthetic and UCI datasets.
We propose a new formulation of robust regression that integrates over all realizations of the uncertainty set and takes an averaged approach to obtain the optimal solution of the ordinary least squares regression problem. We show that this formulation recovers ridge regression exactly and establishes the missing link between robust optimization and mean squared error approaches for existing regression problems. We further demonstrate that whether this equivalence holds depends on the geometric properties of the uncertainty set. We provide exact closed-form, and in some cases analytical, solutions for the equivalent regularization strength under uncertainty sets induced by the $\ell_p$ norm, the Schatten $p$-norm, and general polytopes. We then show, on synthetic datasets with different levels of uncertainty, that the averaged formulation consistently improves on the existing worst-case formulation in out-of-sample performance. On real-world regression problems drawn from UCI datasets, similar out-of-sample improvements are observed.
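The averaging idea can be illustrated numerically with a minimal sketch. It assumes one particular setup that is not taken from the paper: the perturbation $\Delta$ acts additively on the design matrix $X \in \mathbb{R}^{n \times d}$ and is drawn uniformly from a Frobenius-norm ball of radius $r$ (the Schatten 2-norm case). Under that assumption $\mathbb{E}[\Delta] = 0$ and $\mathbb{E}[\Delta^\top \Delta] = \frac{n r^2}{nd + 2} I$, so the averaged squared loss equals the ordinary least squares loss plus $\lambda \|\beta\|_2^2$ with $\lambda = n r^2 / (nd + 2)$. The function names, the sampling scheme, and this value of $\lambda$ are illustrative assumptions; the paper's own constants for its uncertainty sets may differ.

```python
import numpy as np


def sample_frobenius_ball(rng, shape, radius):
    """Draw one matrix uniformly from the Frobenius-norm ball (assumed uncertainty set)."""
    m = int(np.prod(shape))
    direction = rng.standard_normal(m)
    direction /= np.linalg.norm(direction)
    # Scaling the radius by U^(1/m) makes the draw uniform in volume.
    r = radius * rng.uniform() ** (1.0 / m)
    return (r * direction).reshape(shape)


def averaged_solution(X, y, radius, n_draws, rng):
    """Minimize the Monte Carlo average of ||y - (X + Delta) b||^2 over sampled Delta."""
    d = X.shape[1]
    gram = np.zeros((d, d))
    moment = np.zeros(d)
    for _ in range(n_draws):
        Xp = X + sample_frobenius_ball(rng, X.shape, radius)
        gram += Xp.T @ Xp
        moment += Xp.T @ y
    # Normal equations of the averaged quadratic loss.
    return np.linalg.solve(gram / n_draws, moment / n_draws)


def ridge_solution(X, y, lam):
    """Closed-form ridge estimate (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


rng = np.random.default_rng(0)
n, d, radius = 40, 3, 5.0
X = rng.standard_normal((n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

# Equivalent regularization strength for the assumed uniform Frobenius ball.
lam = n * radius**2 / (n * d + 2)

beta_avg = averaged_solution(X, y, radius, n_draws=5000, rng=rng)
beta_ridge = ridge_solution(X, y, lam)
beta_ols = ridge_solution(X, y, 0.0)

print("averaged :", beta_avg)
print("ridge    :", beta_ridge)
print("OLS      :", beta_ols)
```

With enough draws the averaged estimate should approach the ridge estimate and differ visibly from the unregularized one, which mirrors, for this single uncertainty set, the equivalence the abstract describes.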