DS LG ST MLMay 26, 2023

Feature Adaptation for Sparse Linear Regression

Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

arXiv:2305.16892v17.39 citations

Originality Incremental advance

AI Analysis

This work addresses a computational bottleneck in high-dimensional statistics for researchers and practitioners dealing with sparse data and correlated features, offering incremental improvements over classical methods.

The paper tackles the problem of sparse linear regression with correlated covariates, where existing efficient algorithms like Lasso require more samples than information-theoretically necessary due to ill-conditioned covariance matrices. It provides a polynomial-time algorithm that adapts the Lasso to tolerate approximate dependencies, achieving near-optimal sample complexity for constant sparsity and few outlier eigenvalues, and offers the first polynomial-factor improvement over brute-force search for arbitrary covariance.

Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,Σ)$, and we seek an estimator with small excess risk. If the true signal is $t$-sparse, information-theoretically, it is possible to achieve strong recovery guarantees with only $O(t\log n)$ samples. However, computationally efficient algorithms have sample complexity linear in (some variant of) the condition number of $Σ$. Classical algorithms such as the Lasso can require significantly more samples than necessary even if there is only a single sparse approximate dependency among the covariates. We provide a polynomial-time algorithm that, given $Σ$, automatically adapts the Lasso to tolerate a small number of approximate dependencies. In particular, we achieve near-optimal sample complexity for constant sparsity and if $Σ$ has few ``outlier'' eigenvalues. Our algorithm fits into a broader framework of feature adaptation for sparse linear regression with ill-conditioned covariates. With this framework, we additionally provide the first polynomial-factor improvement over brute-force search for constant sparsity $t$ and arbitrary covariance $Σ$.

View on arXiv PDF

Similar