LG · Aug 18, 2023

Latent State Models of Training Dynamics

CMU, Harvard
arXiv:2308.09543v3 · 20 citations · h-index: 96
Originality: Synthesis-oriented
AI Analysis

This work helps machine learning practitioners interpret variability across training runs, but it is incremental: it applies existing HMM methods to a new kind of data, namely metrics recorded during training.

The paper studies how randomness affects neural network training dynamics. It tracks metrics such as weight norms across multiple training runs with different seeds, then fits a hidden Markov model (HMM) to the resulting sequences. The fitted HMM reveals latent states and phase transitions, including 'detour' states that slow convergence, in tasks such as grokking and image classification.

The impact of randomness on model training is poorly understood. How do differences in data order and initialization actually manifest in the model, such that some training runs outperform others or converge faster? Furthermore, how can we interpret the resulting training dynamics and the phase transitions that characterize different trajectories? To understand the effect of randomness on the dynamics and outcomes of neural network training, we train models multiple times with different random seeds and compute a variety of metrics throughout training, such as the $L_2$ norm, mean, and variance of the neural network's weights. We then fit a hidden Markov model (HMM) over the resulting sequences of metrics. The HMM represents training as a stochastic process of transitions between latent states, providing an intuitive overview of significant changes during training. Using our method, we produce a low-dimensional, discrete representation of training dynamics on grokking tasks, image classification, and masked language modeling. We use the HMM representation to study phase transitions and identify latent "detour" states that slow down convergence.
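The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it computes the three weight metrics named above for a few toy checkpoints, then decodes a latent-state path over the $L_2$ norm alone, using Viterbi decoding on a two-state Gaussian HMM whose parameters are hand-picked for illustration. The paper instead fits the HMM to metric sequences from many runs. All function names, checkpoint values, and HMM parameters here are hypothetical.

```python
import math

def weight_metrics(weights):
    """Return (L2 norm, mean, variance) for a flat list of weights."""
    n = len(weights)
    l2 = math.sqrt(sum(w * w for w in weights))
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n  # population variance
    return l2, mean, var

def log_gauss(x, mu, sigma):
    """Log density of a 1-D Gaussian with mean mu and std sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def viterbi(obs, log_start, log_trans, means, sigmas):
    """Most likely state path for a 1-D Gaussian HMM with GIVEN parameters."""
    k = len(means)
    score = [log_start[s] + log_gauss(obs[0], means[s], sigmas[s]) for s in range(k)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev, ptr, score = score, [], []
        for s in range(k):
            best = max(range(k), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + log_gauss(x, means[s], sigmas[s]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy checkpoints (hypothetical): flattened weights saved during one run.
checkpoints = [
    [0.1, -0.1, 0.1, -0.1],
    [0.1, -0.1, 0.2, -0.1],
    [1.0, -1.0, 1.0, -1.0],
    [1.0, -1.0, 1.1, -1.0],
]
norms = [weight_metrics(w)[0] for w in checkpoints]

# Hand-picked (NOT fitted) parameters: state 0 = small norm, state 1 = large norm.
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
path = viterbi(norms, log_start, log_trans, means=[0.2, 2.0], sigmas=[0.3, 0.3])
print(path)
```

In this toy run the decoded path changes state where the weight norm jumps, which is the kind of discrete summary the HMM representation is meant to provide. A closer reproduction would use all metrics jointly and estimate the HMM parameters from many seeds, for example with an expectation-maximization routine from an HMM library.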

Code Implementations: 1 repo