Robust High-Dimensional Mean Estimation With Low Data Size, an Empirical Study
This work addresses a practical limitation in robust statistics for high-dimensional data analysis. Its contribution is incremental, however, because it evaluates existing methods empirically rather than proposing new ones.
The paper tackles robust mean estimation in high-dimensional settings where the sample size is too small to meet the requirements of existing algorithms, and it evaluates those algorithms empirically under these conditions.
Robust statistics aims to compute quantities that represent data in which a fraction of the points may be arbitrarily corrupted. The most essential statistic is the mean, and in recent years there has been a flurry of theoretical advances in efficiently estimating the mean of corrupted data in high dimensions. While several algorithms have been proposed that achieve near-optimal error, they all require a data size that is large as a function of the dimension. In this paper, we perform extensive experiments with various mean estimation techniques in settings where the data size might not meet this requirement because of the high dimension.
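To make the setting concrete, the following is a minimal sketch of the problem, not the paper's experimental setup. It assumes standard Gaussian inliers with true mean zero, a corruption fraction of 0.1, and outliers all placed at a single far-away point; the names `make_corrupted_sample`, `sample_mean`, `coordinate_median`, and `l2_error` are hypothetical. The coordinate-wise median is used only as a simple baseline for comparison and is not one of the near-optimal algorithms the abstract refers to. The data size (50) is deliberately smaller than the dimension (200).

```python
import math
import random
import statistics


def make_corrupted_sample(n, d, eps, shift, seed=0):
    """Draw n points from N(0, I_d), then overwrite an eps-fraction with outliers."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    n_bad = int(eps * n)
    for i in range(n_bad):
        # Assumption: every outlier sits at the same far-away point.
        data[i] = [shift] * d
    return data


def sample_mean(data):
    """Plain (non-robust) mean of each coordinate."""
    n = len(data)
    return [sum(col) / n for col in zip(*data)]


def coordinate_median(data):
    """Median of each coordinate, a simple robust baseline."""
    return [statistics.median(col) for col in zip(*data)]


def l2_error(estimate, true_mean):
    """Euclidean distance between an estimate and the true mean."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, true_mean)))


# Illustrative values only: fewer samples than dimensions.
n, d, eps, shift = 50, 200, 0.1, 10.0
data = make_corrupted_sample(n, d, eps, shift)
true_mean = [0.0] * d

mean_err = l2_error(sample_mean(data), true_mean)
median_err = l2_error(coordinate_median(data), true_mean)

print(f"sample mean error:            {mean_err:.2f}")
print(f"coordinate-wise median error: {median_err:.2f}")
```

In this sketch the outliers pull every coordinate of the sample mean toward the corrupted point, so its error grows with the square root of the dimension, while the coordinate-wise median is affected much less. How more sophisticated estimators behave when the data size is small relative to the dimension is the question the paper studies empirically.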