ME ML Jun 14, 2021

Robust Inference for High-Dimensional Linear Models via Residual Randomization

Y. Samuel Wang, Si Kai Lee, Panos Toulis, Mladen Kolar

arXiv:2106.07717v23.35 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses unreliable statistical inference in high-dimensional data, offering researchers and practitioners a more robust option in realistic scenarios with heavy-tailed distributions or small samples. The advance is incremental: it extends existing methods to conditions that earlier work overlooked.

The authors develop a residual randomization method for robust Lasso-based inference in high-dimensional linear models. The method works under heavy-tailed covariates and errors as well as clustered errors; it outperforms state-of-the-art methods in challenging settings and remains competitive in standard ones.

We propose a residual randomization procedure designed for robust Lasso-based inference in the high-dimensional setting. Compared to earlier work that focuses on sub-Gaussian errors, the proposed procedure is designed to work robustly in settings that also include heavy-tailed covariates and errors. Moreover, our procedure can be valid under clustered errors, which is important in practice, but has been largely overlooked by earlier work. Through extensive simulations, we illustrate our method's wider range of applicability as suggested by theory. In particular, we show that our method outperforms state-of-the-art methods in challenging, yet more realistic, settings where the distribution of covariates is heavy-tailed or the sample size is small, while it remains competitive in standard, "well behaved" settings previously studied in the literature.
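The general idea of residual randomization can be illustrated with a small sketch. This is not the authors' procedure, which is not spelled out in the abstract; it is a simplified stand-in that assumes the errors are symmetric around zero (so random sign flips leave their distribution unchanged) and that clustered errors can be handled by flipping signs one cluster at a time. The function names `lasso_cd` and `sign_flip_pvalue`, the penalty level `lam`, and the choice of test statistic are all hypothetical choices made for illustration.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1.

    X is a list of rows, y a list of floats. Returns (coefficients, residuals).
    """
    n = len(X)
    p = len(X[0]) if n else 0
    b = [0.0] * p
    r = list(y)  # running residual y - Xb
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual, adding back j's own fit.
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != b[j]:
                delta = new - b[j]
                for i in range(n):
                    r[i] -= X[i][j] * delta
                b[j] = new
    return b, r


def sign_flip_pvalue(X, y, j, lam, clusters=None, n_draws=999, seed=0):
    """Randomization p-value for H0: beta_j = 0 (illustrative assumption).

    Fits the Lasso without column j, then compares |x_j' e| / n on the
    observed residuals e against the same statistic on sign-flipped residuals.
    Signs are drawn once per cluster; with clusters=None every observation
    is its own cluster.
    """
    rng = random.Random(seed)
    n = len(y)
    X_rest = [[v for k, v in enumerate(row) if k != j] for row in X]
    _, resid = lasso_cd(X_rest, y, lam)
    xj = [row[j] for row in X]
    if clusters is None:
        clusters = list(range(n))
    labels = sorted(set(clusters))

    def stat(e):
        return abs(sum(a * b for a, b in zip(xj, e))) / n

    t_obs = stat(resid)
    exceed = 0
    for _ in range(n_draws):
        sign = {c: rng.choice((-1.0, 1.0)) for c in labels}
        flipped = [sign[c] * e for c, e in zip(clusters, resid)]
        if stat(flipped) >= t_obs:
            exceed += 1
    # The +1 terms count the observed statistic as one of the draws.
    return (1 + exceed) / (1 + n_draws)
```

Passing a `clusters` list with one label per observation makes all residuals in a cluster share the same random sign, which preserves any within-cluster dependence in the randomized draws. A full treatment in the high-dimensional setting would also need to account for the bias introduced by the Lasso penalty, which this sketch ignores.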
