MLLGJun 17, 2025

Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means

arXiv:2506.14673v34 citationsh-index: 3ICML
Originality Incremental advance
AI Analysis

This work addresses robust mean estimation for heavy-tailed data, which is crucial for machine learning applications like clustering and regression, but it appears incremental as it builds on existing Median-of-Means methods.

The paper tackles the problem of simultaneously estimating the means of functions in a class for heavy-tailed distributions with limited moments, achieving a new sample complexity bound. It applies this to improve results in k-means clustering with unbounded inputs and linear regression with general losses.

The Median of Means (MoM) is a mean estimator that has gained popularity in the context of heavy-tailed data. In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class $\mathcal{F}$ when the data distribution possesses only the first $p$ moments for $p \in (1,2]$. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to $k$-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes