ML LG Feb 17, 2025

Robust High-Dimensional Mean Estimation With Low Data Size, an Empirical Study

arXiv:2502.11324v14.51 citations | h-index: 1 | Has Code | Trans. Mach. Learn. Res.

Originality: Synthesis-oriented

AI Analysis

This work addresses a practical limitation in robust statistics for high-dimensional data analysis, but its contribution is incremental: it evaluates existing methods empirically rather than proposing new ones.

The paper tackles the problem of robust mean estimation in high-dimensional settings with insufficient data size, conducting an empirical study to evaluate existing algorithms under these conditions.

Robust statistics aims to compute quantities that represent data in which a fraction of the points may be arbitrarily corrupted. The most essential statistic is the mean, and in recent years there has been a flurry of theoretical advances in efficiently estimating the mean of corrupted data in high dimensions. While several proposed algorithms achieve near-optimal error, they all require a data size that is large as a function of the dimension. In this paper, we perform extensive experiments on various mean estimation techniques in settings where the data size might not meet this requirement because of the high dimension.
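To make the setting concrete, the following is a minimal sketch of a low-data experiment, not code from the paper. It compares the plain sample mean with the coordinate-wise median, a standard simple baseline chosen here only for illustration; the source does not say which estimators the study covers. The function names, the corruption model (all outliers placed at one fixed point), and the values of `n`, `d`, `eps`, and `shift` are assumptions.

```python
import random
import statistics


def make_corrupted_sample(n, d, eps, shift, seed=0):
    """Draw n points from N(0, I_d), then replace an eps fraction with outliers.

    The true mean is the zero vector. All outliers sit at (shift, ..., shift),
    an assumed, simple corruption model rather than a worst-case adversary.
    """
    rng = random.Random(seed)
    n_bad = int(eps * n)
    clean = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n - n_bad)]
    bad = [[shift] * d for _ in range(n_bad)]
    return clean + bad


def sample_mean(points):
    """Non-robust baseline: average each coordinate over all points."""
    return [statistics.fmean(col) for col in zip(*points)]


def coordinatewise_median(points):
    """Simple robust baseline: median of each coordinate over all points."""
    return [statistics.median(col) for col in zip(*points)]


def l2_error(estimate):
    """Euclidean distance from the true mean, which is the origin here."""
    return sum(x * x for x in estimate) ** 0.5


# Low-data regime: fewer samples than dimensions (n < d).
points = make_corrupted_sample(n=50, d=200, eps=0.1, shift=10.0)
mean_err = l2_error(sample_mean(points))
median_err = l2_error(coordinatewise_median(points))
print(f"sample mean error:            {mean_err:.2f}")
print(f"coordinate-wise median error: {median_err:.2f}")
```

In this sketch the outliers pull every coordinate of the sample mean by roughly `eps * shift`, so its error grows with the square root of the dimension, while the median is affected far less. The more sophisticated algorithms referred to in the abstract aim for error that does not grow with dimension, which is where their data size requirements come in.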

