Spring-block theory of feature learning in deep neural networks

arXiv:2407.19353v4 · 4 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses a foundational problem in machine learning by providing a theoretical framework for feature learning in deep nets. The advance is rated incremental because it builds on existing theories.

The paper asks how feature learning in deep neural networks emerges from factors such as nonlinearity and noise, and proposes a macroscopic mechanical theory that links feature learning across layers to generalization.

Feature-learning deep nets progressively collapse data to a regular low-dimensional geometry. How this emerges from the collective action of nonlinearity, noise, learning rate, and other factors has eluded first-principles theories built from microscopic neuronal dynamics. We exhibit a noise-nonlinearity phase diagram that identifies regimes where shallow or deep layers learn more effectively, and propose a macroscopic mechanical theory that reproduces the diagram and links feature learning across layers to generalization.
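One way to make "collapse" concrete is to score how tightly each class clusters relative to the spacing between class means, and then compare that score from layer to layer. The sketch below is illustrative only: the name `collapse_ratio`, the variance-ratio definition, and the toy feature vectors are assumptions made for this example and are not taken from the paper, which may use a different measure.

```python
# Illustrative sketch only: a scalar "collapse" score for labelled features.
# collapse_ratio and its variance-ratio definition are assumptions made for
# this example; they are not the paper's stated metric.

def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def collapse_ratio(features, labels):
    """Within-class scatter divided by between-class scatter.

    Smaller values mean each class is more tightly clustered relative to
    the spacing between the class means.
    """
    global_mean = mean_vector(features)

    # Group feature vectors by their class label.
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)

    within = 0.0
    between = 0.0
    for rows in by_class.values():
        mu = mean_vector(rows)
        within += sum(sq_dist(x, mu) for x in rows)
        between += len(rows) * sq_dist(mu, global_mean)

    # Raises ZeroDivisionError if all class means coincide.
    return within / between


# Hypothetical features of the same four samples at two depths.
labels = [0, 0, 1, 1]
shallow = [[0, 0], [2, 0], [4, 0], [6, 0]]          # loosely clustered
deep = [[0.9, 0], [1.1, 0], [4.9, 0], [5.1, 0]]     # tightly clustered

print(collapse_ratio(shallow, labels))  # 0.25
print(collapse_ratio(deep, labels))     # smaller than the shallow value
```

Tracking such a score across the layers of a trained network gives a per-layer curve. Under this assumed metric, comparing how much the score drops in early versus late layers is one way to see whether shallow or deep layers are doing more of the learning.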

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
