LG, AI | Mar 2

The Malignant Tail: Spectral Segregation of Label Noise in Over-Parameterized Networks

arXiv:2603.02293v1 | h-index: 1

Originality: Incremental advance

AI Analysis

This paper addresses robust generalization for deep learning practitioners facing label noise and offers a stable post-hoc method to mitigate memorization. It is rated incremental because it builds on existing spectral analysis and noise theory.

The paper tackles harmful overfitting in over-parameterized networks trained under label noise. It identifies the Malignant Tail mechanism, in which networks segregate noise into high-frequency subspaces, and shows that explicit spectral truncation can recover optimal generalization, with results comparable to noise-free training.

While implicit regularization facilitates benign overfitting in low-noise regimes, recent theoretical work predicts a sharp phase transition to harmful overfitting as the noise-to-signal ratio increases. We experimentally isolate the geometric mechanism of this transition: the Malignant Tail, a failure mode in which networks functionally segregate signal and noise, reducing coherent semantic features to low-rank subspaces while pushing stochastic label noise (as distinct from systematic or corruption-aligned noise) into high-frequency orthogonal components. Through a Spectral Linear Probe of training dynamics, we demonstrate that Stochastic Gradient Descent (SGD) fails to suppress this noise and instead implicitly biases it toward high-frequency orthogonal subspaces, effectively preserving signal-noise separability. We show that this geometric separation is distinct from simple variance reduction in untrained models. In trained networks, SGD actively segregates noise, which allows post-hoc Explicit Spectral Truncation (d << D) to surgically prune the noise-dominated subspace. This approach recovers the optimal generalization capability latent in the converged model. Unlike unstable temporal early stopping, Geometric Truncation provides a stable post-hoc intervention. Our findings suggest that under label noise, excess spectral capacity is not harmless redundancy but a latent structural liability that allows for noise memorization, necessitating explicit rank constraints to filter stochastic corruptions for robust generalization.
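
As a rough illustration of what post-hoc spectral truncation followed by a linear probe could look like, the sketch below projects a feature matrix onto its top-d principal directions and fits a least-squares probe on the retained coordinates. It is not the authors' implementation: the abstract does not specify the procedure, so the PCA-style truncation via SVD, the least-squares probe, the reading of d as the retained rank and D as the full feature dimension, all function names, and the random stand-in data are assumptions made for illustration.

```python
import numpy as np


def top_d_basis(features, d):
    """Return the feature mean and the top-d principal directions (d x D)."""
    mean = features.mean(axis=0, keepdims=True)
    # SVD of the centered features; rows of vt are ordered by decreasing
    # singular value, so vt[:d] spans the dominant d-dimensional subspace.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:d]


def truncate(features, mean, basis):
    """Project features onto the retained subspace, dropping the spectral tail."""
    return (features - mean) @ basis.T


def add_bias(coords):
    """Append a constant column so the probe can learn an intercept."""
    return np.hstack([coords, np.ones((coords.shape[0], 1))])


def fit_probe(coords, onehot_labels):
    """Least-squares linear probe on the truncated coordinates (assumed probe type)."""
    weights, *_ = np.linalg.lstsq(add_bias(coords), onehot_labels, rcond=None)
    return weights


def predict(coords, weights):
    """Predicted class index for each row of truncated coordinates."""
    return (add_bias(coords) @ weights).argmax(axis=1)


# Hypothetical stand-in data: random features and labels, used only to
# exercise the shapes. Real use would take penultimate-layer features
# from a converged network and the (possibly noisy) training labels.
rng = np.random.default_rng(0)
D, d, n, num_classes = 64, 8, 200, 4
features = rng.normal(size=(n, D))
labels = rng.integers(0, num_classes, size=n)

mean, basis = top_d_basis(features, d)
coords = truncate(features, mean, basis)
weights = fit_probe(coords, np.eye(num_classes)[labels])
predictions = predict(coords, weights)
```

In practice one would sweep d and compare probe accuracy on clean held-out labels, reusing the `mean` and `basis` estimated on training features when truncating test features.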
