LG MLApr 20, 2020

Learning Entangled Single-Sample Distributions via Iterative Trimming

arXiv:2004.09563v24.28 citations

Originality Incremental advance

AI Analysis

This provides theoretical justification for iterative trimming in mean estimation and linear regression, addressing robustness in settings with entangled distributions, though it is incremental as it builds on existing methods.

The paper tackles the problem of estimating common parameters from single-sample distributions using iterative trimming, showing that the method's error depends only on the noise level of a constant fraction of the noisiest points, tolerating high-noise outliers.

In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one \emph{single} sample from each distribution. We study mean estimation and linear regression under general conditions, and analyze a simple and computationally efficient method based on iteratively trimming samples and re-estimating the parameter on the trimmed sample set. We show that the method in logarithmic iterations outputs an estimation whose error only depends on the noise level of the $\lceil αn \rceil$-th noisiest data point where $α$ is a constant and $n$ is the sample size. This means it can tolerate a constant fraction of high-noise points. These are the first such results for the method under our general conditions. It also justifies the wide application and empirical success of iterative trimming in practice. Our theoretical results are complemented by experiments on synthetic data.

View on arXiv PDF

Similar