LG AI · Oct 10, 2023

Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition

Zhongtian Chen, Edmund Lau, Jake Mendel, Susan Wei, Daniel Murfet

arXiv:2310.06301v1 19.224 citations h-index: 7

Originality: Incremental advance

AI Analysis

This work provides incremental evidence for the conjecture that SGD learning trajectories follow a sequential learning mechanism, a result relevant to researchers in machine learning theory and optimization.

The paper investigates phase transitions in a Toy Model of Superposition using Singular Learning Theory. It derives a closed formula for the theoretical loss and shows that, with two hidden dimensions, regular $k$-gon critical points govern both the phase transitions of the Bayesian posterior and the behavior of SGD training.

We investigate phase transitions in a Toy Model of Superposition (TMS) using Singular Learning Theory (SLT). We derive a closed formula for the theoretical loss and, in the case of two hidden dimensions, discover that regular $k$-gons are critical points. We present supporting theory indicating that the local learning coefficient (a geometric invariant) of these $k$-gons determines phase transitions in the Bayesian posterior as a function of training sample size. We then show empirically that the same $k$-gon critical points also determine the behavior of SGD training. The picture that emerges adds evidence to the conjecture that the SGD learning trajectory is subject to a sequential learning mechanism. Specifically, we find that the learning process in TMS, be it through SGD or Bayesian learning, can be characterized by a journey through parameter space from regions of high loss and low complexity to regions of low loss and high complexity.
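
To make the setting concrete, here is a minimal sketch of a TMS-style reconstruction loss evaluated at a regular $k$-gon configuration. The abstract does not spell out the model, so everything here is an assumption for illustration: the reconstruction map $\mathrm{ReLU}(W^\top W x + b)$ with two hidden dimensions, inputs with a single active coordinate drawn uniformly from $[0, 1]$, and the hypothetical helper names `kgon_weights`, `sample_inputs`, and `tms_loss`. The radius and bias used below are placeholders, not the values at the paper's critical points.

```python
import numpy as np

def kgon_weights(k, c, radius=1.0):
    """Return a 2 x c matrix whose first k columns are the vertices of a
    regular k-gon of the given radius; the remaining columns are zero.
    (Illustrative parameterization, not taken from the paper.)"""
    W = np.zeros((2, c))
    angles = 2.0 * np.pi * np.arange(k) / k
    W[0, :k] = radius * np.cos(angles)
    W[1, :k] = radius * np.sin(angles)
    return W

def sample_inputs(n, c, rng):
    """Assumed input distribution: exactly one coordinate is active,
    chosen uniformly, with a value drawn from Uniform[0, 1]."""
    X = np.zeros((n, c))
    idx = rng.integers(0, c, size=n)
    X[np.arange(n), idx] = rng.uniform(0.0, 1.0, size=n)
    return X

def tms_loss(W, b, X):
    """Empirical mean squared reconstruction error of ReLU(W^T W x + b)."""
    recon = np.maximum(X @ (W.T @ W) + b, 0.0)
    return np.mean(np.sum((X - recon) ** 2, axis=1))

rng = np.random.default_rng(0)
c = 6                                   # number of input features (assumed)
X = sample_inputs(2000, c, rng)
b = np.zeros(c)                         # placeholder bias

loss_zero = tms_loss(np.zeros((2, c)), b, X)   # model that reconstructs nothing
loss_5gon = tms_loss(kgon_weights(5, c), b, X) # five features on a pentagon
print(f"zero weights: {loss_zero:.4f}, 5-gon: {loss_5gon:.4f}")
```

Comparing the loss across different `k` (and across radii and biases) gives a rough feel for the landscape the abstract describes. Locating the actual critical points and their local learning coefficients requires the paper's closed-form loss, which this sketch does not reproduce.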
