LG DS ST MLMar 12, 2025

Batch List-Decodable Linear Regression via Higher Moments

Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Sihan Liu, Thanasis Pittas

arXiv:2503.09802v17.11 citationsh-index: 48ICML

Originality Incremental advance

AI Analysis

This addresses robust regression in adversarial settings for machine learning, offering incremental improvements in theoretical guarantees.

The paper tackles list-decodable linear regression with batches, where only an α-fraction of batches are clean, by proposing a new polynomial-time algorithm that achieves a list size of O(1/α) and error O(α^{-δ/2}/√n) under assumptions of SoS-certifiable bounded higher moments, improving over prior work in batch size and error.

We study the task of list-decodable linear regression using batches. A batch is called clean if it consists of i.i.d. samples from an unknown linear regression distribution. For a parameter $α\in (0, 1/2)$, an unknown $α$-fraction of the batches are clean and no assumptions are made on the remaining ones. The goal is to output a small list of vectors at least one of which is close to the true regressor vector in $\ell_2$-norm. [DJKS23] gave an efficient algorithm, under natural distributional assumptions, with the following guarantee. Assuming that the batch size $n$ satisfies $n \geq \tildeΩ(α^{-1})$ and the number of batches is $m = \mathrm{poly}(d, n, 1/α)$, their algorithm runs in polynomial time and outputs a list of $O(1/α^2)$ vectors at least one of which is $\tilde{O}(α^{-1/2}/\sqrt{n})$ close to the target regressor. Here we design a new polynomial time algorithm with significantly stronger guarantees under the assumption that the low-degree moments of the covariates distribution are Sum-of-Squares (SoS) certifiably bounded. Specifically, for any constant $δ>0$, as long as the batch size is $n \geq Ω_δ(α^{-δ})$ and the degree-$Θ(1/δ)$ moments of the covariates are SoS certifiably bounded, our algorithm uses $m = \mathrm{poly}((dn)^{1/δ}, 1/α)$ batches, runs in polynomial-time, and outputs an $O(1/α)$-sized list of vectors one of which is $O(α^{-δ/2}/\sqrt{n})$ close to the target. That is, our algorithm achieves substantially smaller minimum batch size and final error, while achieving the optimal list size. Our approach uses higher-order moment information by carefully combining the SoS paradigm interleaved with an iterative method and a novel list pruning procedure. In the process, we give an SoS proof of the Marcinkiewicz-Zygmund inequality that may be of broader applicability.

View on arXiv PDF

Similar