Efficient List-Decodable Regression using Batches
This addresses robust regression in adversarial data scenarios, providing a first polynomial-time solution for list-decodable regression by leveraging batch structure, which is incremental due to building on prior lower bounds.
The paper tackles the problem of list-decodable linear regression in a batch setting where only a fraction of batches are genuine, by developing a polynomial-time algorithm that returns a list containing an estimate close to the true parameter, requiring only a near-linear number of genuine batches in dimension and achieving a list size of O(1/α²).
We begin the study of list-decodable linear regression using batches. In this setting only an $α\in (0,1]$ fraction of the batches are genuine. Each genuine batch contains $\ge n$ i.i.d. samples from a common unknown distribution and the remaining batches may contain arbitrary or even adversarial samples. We derive a polynomial time algorithm that for any $n\ge \tilde Ω(1/α)$ returns a list of size $\mathcal O(1/α^2)$ such that one of the items in the list is close to the true regression parameter. The algorithm requires only $\tilde{\mathcal{O}}(d/α^2)$ genuine batches and works under fairly general assumptions on the distribution. The results demonstrate the utility of batch structure, which allows for the first polynomial time algorithm for list-decodable regression, which may be impossible for the non-batch setting, as suggested by a recent SQ lower bound \cite{diakonikolas2021statistical} for the non-batch setting.