LG · Jan 1

Task-Driven Kernel Flows: Label Rank Compression and Laplacian Spectral Filtering

arXiv:2601.00276v1 · 2.71 citations · h-index: 2

Originality: Highly original

AI Analysis

This work provides a theoretical foundation for understanding compression in supervised learning. The contribution is incremental, but it clarifies how supervised learning differs from self-supervised methods.

The paper addresses feature learning in wide neural networks. It shows that supervised learning inherently compresses the kernel: at any stable steady state, the kernel rank is at most the number of classes. SGD noise is likewise confined to a low-rank subspace. This contrasts with the high-rank representations produced by self-supervised learning.

We present a theory of feature learning in wide L2-regularized networks showing that supervised learning is inherently compressive. We derive a kernel ODE that predicts a "water-filling" spectral evolution and prove that for any stable steady state, the kernel rank is bounded by the number of classes ($C$). We further demonstrate that SGD noise is similarly low-rank ($O(C)$), confining dynamics to the task-relevant subspace. This framework unifies the deterministic and stochastic views of alignment and contrasts the low-rank nature of supervised learning with the high-rank, expansive representations of self-supervision.
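The rank bound can be made concrete with a toy calculation. The sketch below is not the paper's kernel ODE; it assumes an idealized kernel that is perfectly aligned with one-hot labels, $K = YY^\top$, and checks that its rank cannot exceed $C$. The helper name `label_kernel` and the sample sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def label_kernel(labels, num_classes):
    """Idealized, fully label-aligned kernel K = Y Y^T (an assumption for illustration)."""
    Y = np.eye(num_classes)[labels]  # one-hot label matrix, shape (n, C)
    return Y @ Y.T                   # Gram matrix, shape (n, n)

rng = np.random.default_rng(0)
n, C = 200, 10                       # hypothetical sample count and class count
labels = rng.integers(0, C, size=n)

K = label_kernel(labels, C)
rank = np.linalg.matrix_rank(K)

# K is n x n, yet its rank is limited by the number of classes present in the data.
print(f"kernel shape: {K.shape}, rank: {rank}, classes: {C}")
```

In this toy setting the rank equals the number of distinct classes that appear in `labels`, which is at most $C$ regardless of how large `n` is.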

