Long-Tail Theory under Gaussian Mixtures
This work addresses generalization under long-tailed data distributions, a practical concern for machine learning practitioners. Its contribution is incremental, since it builds on existing theory.
The authors propose a Gaussian mixture model for long-tailed data that is consistent with Feldman's long-tail theory. They show that nonlinear classifiers with memorization capacity outperform linear ones, and their experiments confirm a performance gap that shrinks as the tail shortens.
We propose a simple Gaussian mixture model for data generation that is consistent with Feldman's long-tail theory (2020). We show that, under the proposed model, a linear classifier cannot reduce the generalization error below a certain level, whereas a nonlinear classifier with memorization capacity can. This confirms that, for long-tailed distributions, rare training examples must be taken into account to generalize optimally to new data. Finally, we show that the performance gap between linear and nonlinear models narrows as the tail of the subpopulation frequency distribution becomes shorter, as confirmed by experiments on synthetic and real data.
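To make the setting concrete, the following is a minimal sketch of the kind of comparison described above, not the paper's actual construction. Everything specific in it is an assumption chosen for illustration: the Zipf-like subpopulation frequencies controlled by `alpha`, the alternating labels, the isotropic noise level `sigma`, the least-squares linear classifier, and the use of 1-nearest-neighbor as a stand-in for "a nonlinear classifier with memorization capacity".

```python
import numpy as np

def sample(rng, n, centers, labels, probs, sigma):
    """Draw n points: pick a subpopulation, then add isotropic Gaussian noise."""
    k = rng.choice(len(probs), size=n, p=probs)
    x = centers[k] + sigma * rng.standard_normal((n, centers.shape[1]))
    return x, labels[k]

def linear_error(xtr, ytr, xte, yte):
    """Test error of a least-squares linear classifier with a bias term."""
    a_tr = np.hstack([xtr, np.ones((len(xtr), 1))])
    w, *_ = np.linalg.lstsq(a_tr, ytr, rcond=None)
    a_te = np.hstack([xte, np.ones((len(xte), 1))])
    return float(np.mean(np.sign(a_te @ w) != yte))

def one_nn_error(xtr, ytr, xte, yte):
    """Test error of 1-nearest-neighbor, which memorizes every training point."""
    d2 = ((xte[:, None, :] - xtr[None, :, :]) ** 2).sum(axis=-1)
    return float(np.mean(ytr[d2.argmin(axis=1)] != yte))

def run(alpha=1.0, n_sub=50, dim=5, n_train=500, n_test=1000, sigma=0.1, seed=0):
    """Return (linear error, 1-NN error) for one synthetic long-tailed mixture.

    All parameters are hypothetical; a smaller alpha gives a flatter,
    longer tail of subpopulation frequencies.
    """
    rng = np.random.default_rng(seed)
    centers = rng.standard_normal((n_sub, dim))           # subpopulation means
    labels = np.where(np.arange(n_sub) % 2 == 0, 1.0, -1.0)  # alternate by rank
    probs = 1.0 / np.arange(1, n_sub + 1) ** alpha        # Zipf-like frequencies
    probs /= probs.sum()
    xtr, ytr = sample(rng, n_train, centers, labels, probs, sigma)
    xte, yte = sample(rng, n_test, centers, labels, probs, sigma)
    return linear_error(xtr, ytr, xte, yte), one_nn_error(xtr, ytr, xte, yte)

lin_err, nn_err = run()
print(f"linear test error: {lin_err:.3f}   1-NN test error: {nn_err:.3f}")
```

With many more subpopulations than input dimensions, a single hyperplane cannot label all the subpopulation centers correctly, while the memorizing classifier only needs to have seen each subpopulation at least once. Calling `run` with different values of `alpha` is one way to explore how the gap responds to the tail length in this toy setup; the sketch makes no claim about matching the reported results.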