ST, CR, LG, May 26, 2023

Robust Nonparametric Regression under Poisoning Attack

arXiv:2305.16771v2, 13 citations
Originality Highly original
AI Analysis

This work addresses the robustness of nonparametric regression to data poisoning. It proposes a Huber-loss M-estimator, then adds a Lipschitz-projection correction that repairs the estimator's vulnerability to attacked samples concentrated in a small region.

The paper tackles robust nonparametric regression under adversarial poisoning attacks, where an attacker can modify up to q samples in a training dataset. It proposes an M-estimator with Huber loss and a correction method via Lipschitz projection, achieving nearly minimax optimal error rates for arbitrary q, up to a ln N factor.

This paper studies robust nonparametric regression, in which an adversarial attacker can modify the values of up to $q$ samples from a training dataset of size $N$. Our initial solution is an M-estimator based on Huber loss minimization. Compared with simple kernel regression, i.e. the Nadaraya-Watson estimator, this method can significantly weaken the impact of malicious samples on the regression performance. We provide the convergence rate as well as the corresponding minimax lower bound. The result shows that, with proper bandwidth selection, $\ell_\infty$ error is minimax optimal. The $\ell_2$ error is optimal with relatively small $q$, but is suboptimal with larger $q$. The reason is that this estimator is vulnerable if there are many attacked samples concentrating in a small region. To address this issue, we propose a correction method by projecting the initial estimate to the space of Lipschitz functions. The final estimate is nearly minimax optimal for arbitrary $q$, up to a $\ln N$ factor.
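The contrast between the Nadaraya-Watson estimator and a Huber-loss M-estimator can be illustrated with a minimal sketch. It is not the paper's implementation: the uniform kernel, the threshold name `delta`, the iteratively reweighted least squares solver, and the toy attack (setting the $q$ samples nearest to a query point to the value 50) are all assumptions chosen for illustration. The Lipschitz projection step is not shown.

```python
import numpy as np

def box_kernel_weights(x_train, x0, h):
    """Uniform-kernel weights (an assumed kernel): 1 within distance h of x0, else 0."""
    return (np.abs(x_train - x0) <= h).astype(float)

def nadaraya_watson(x_train, y_train, x0, h):
    """Kernel-weighted average of the labels around x0."""
    w = box_kernel_weights(x_train, x0, h)
    return np.sum(w * y_train) / np.sum(w)

def huber_local_estimate(x_train, y_train, x0, h, delta=1.0, n_iter=50):
    """Minimise sum_i w_i * huber_delta(y_i - s) over the scalar s.

    Solved by iteratively reweighted least squares (an assumed solver).
    """
    w = box_kernel_weights(x_train, x0, h)
    s = np.median(y_train[w > 0])  # robust starting point
    for _ in range(n_iter):
        r = np.abs(y_train - s)
        # IRLS weight: 1 in the quadratic zone, delta/|r| in the linear zone
        u = w * np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
        s_new = np.sum(u * y_train) / np.sum(u)
        if abs(s_new - s) < 1e-10:
            break
        s = s_new
    return s

# Toy data: N clean samples of a smooth function plus small noise.
rng = np.random.default_rng(0)
N, q = 500, 20
x = rng.uniform(0.0, 1.0, N)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(N)

# Hypothetical attack: overwrite the q samples nearest to x0 with a large value.
x0 = 0.25
attacked = np.argsort(np.abs(x - x0))[:q]
y_poisoned = y.copy()
y_poisoned[attacked] = 50.0

h = 0.1
truth = np.sin(2 * np.pi * x0)
nw_est = nadaraya_watson(x, y_poisoned, x0, h)
huber_est = huber_local_estimate(x, y_poisoned, x0, h)

print("true value:     ", truth)
print("Nadaraya-Watson:", nw_est)
print("Huber estimate: ", huber_est)
```

In this toy setting each attacked sample shifts the Huber estimate by an amount bounded by `delta`, whereas its effect on the kernel average grows with the injected value. Concentrating the attack near one point still biases the Huber estimate there, which is the weakness the abstract attributes to the initial estimator.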
