Outlier-Robust High-Dimensional Sparse Estimation via Iterative Filtering
This addresses the problem of robust sparse estimation for high-dimensional data analysis, offering a practically viable solution with incremental improvements over existing methods.
The paper tackles robust sparse mean estimation and robust sparse PCA in high-dimensional settings with adversarial corruptions, achieving near-optimal robustness guarantees and outperforming prior methods in experiments, nearly matching error rates without corruptions.
We study high-dimensional sparse estimation tasks in a robust setting where a constant fraction of the dataset is adversarially corrupted. Specifically, we focus on the fundamental problems of robust sparse mean estimation and robust sparse PCA. We give the first practically viable robust estimators for these problems. In more detail, our algorithms are sample and computationally efficient and achieve near-optimal robustness guarantees. In contrast to prior provable algorithms which relied on the ellipsoid method, our algorithms use spectral techniques to iteratively remove outliers from the dataset. Our experimental evaluation on synthetic data shows that our algorithms are scalable and significantly outperform a range of previous approaches, nearly matching the best error rate without corruptions.