LG · AI · Oct 10, 2023

Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition

arXiv:2310.06301v1 · 24 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work provides incremental evidence for the conjecture that SGD learning trajectories follow a sequential mechanism, relevant for researchers in machine learning theory and optimization.

The paper investigates phase transitions in a Toy Model of Superposition using Singular Learning Theory, deriving a closed formula for theoretical loss and showing that k-gon critical points determine phase transitions in Bayesian learning and SGD training behavior.

We investigate phase transitions in a Toy Model of Superposition (TMS) using Singular Learning Theory (SLT). We derive a closed formula for the theoretical loss and, in the case of two hidden dimensions, discover that regular $k$-gons are critical points. We present supporting theory indicating that the local learning coefficient (a geometric invariant) of these $k$-gons determines phase transitions in the Bayesian posterior as a function of training sample size. We then show empirically that the same $k$-gon critical points also determine the behavior of SGD training. The picture that emerges adds evidence to the conjecture that the SGD learning trajectory is subject to a sequential learning mechanism. Specifically, we find that the learning process in TMS, be it through SGD or Bayesian learning, can be characterized by a journey through parameter space from regions of high loss and low complexity to regions of low loss and high complexity.
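As a rough illustration of the objects the abstract refers to, the sketch below builds regular $k$-gon weight matrices for two hidden dimensions and estimates a loss for each. It is not the paper's code. The model form ReLU(WᵀWx + b) is the one commonly used for the Toy Model of Superposition and is assumed here, as are the one-hot input distribution, the unit vertex length, and the zero bias; the paper's critical points have specific lengths and biases that the abstract does not state. The function names are hypothetical.

```python
import numpy as np

def kgon_weights(k, c, length=1.0):
    """Return a 2 x c matrix whose first k columns are the vertices of a
    regular k-gon and whose remaining columns are zero.
    `length` is an illustrative choice, not the paper's critical value."""
    angles = 2.0 * np.pi * np.arange(k) / k
    W = np.zeros((2, c))
    W[0, :k] = length * np.cos(angles)
    W[1, :k] = length * np.sin(angles)
    return W

def tms_forward(W, b, x):
    """Assumed TMS form ReLU(W^T W x + b), applied to each row of x."""
    return np.maximum(x @ W.T @ W + b, 0.0)

def sample_inputs(n, c, rng):
    """Assumed input distribution: exactly one active feature per sample,
    with magnitude drawn uniformly from [0, 1]."""
    x = np.zeros((n, c))
    idx = rng.integers(0, c, size=n)
    x[np.arange(n), idx] = rng.uniform(0.0, 1.0, size=n)
    return x

def empirical_loss(W, b, x):
    """Mean squared reconstruction error over the sampled inputs."""
    return np.mean(np.sum((x - tms_forward(W, b, x)) ** 2, axis=1))

rng = np.random.default_rng(0)
c = 6                      # number of input features (illustrative)
x = sample_inputs(10_000, c, rng)
b = np.zeros(c)            # zero bias, an assumption for this sketch

# Loss of each k-gon configuration on the same sample.
losses = {k: empirical_loss(kgon_weights(k, c), b, x) for k in range(1, c + 1)}
for k, loss in losses.items():
    print(f"{k}-gon: empirical loss = {loss:.4f}")
```

Because the lengths and biases here are not tuned to the critical values, the printed numbers only show how such configurations can be compared on a common sample; they should not be read as the losses reported in the paper.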

