LG · OC · ML · May 30

Exploiting weight-space symmetries for approximating curvature

arXiv:2606.00442 · 13.8 · h-index: 17
Predicted impact: top 42% in LG · last 90 days · Originality: Highly original
AI Analysis

This work provides a novel theoretical and practical approach to curvature approximation in deep learning, offering a unifying lens for existing methods and potential applications across multiple ML subfields.

The authors propose a framework that exploits weight-space symmetries in loss landscapes to construct structured Hessian approximations from single gradients, enabling tractable curvature estimation for deep networks. They validate the method on various architectures and demonstrate its use in second-order optimization, including for a small language model.

Many machine learning techniques rely on approximating a loss function's curvature, but this is notoriously hard to do at the scale of modern deep networks. Surprisingly, no previous work has exploited the curvature constraints that arise from well-known weight-space symmetries in loss landscapes. By analytically averaging over group actions that leave the loss invariant, we construct structured Hessian approximations from single gradients that can be tractably estimated, stored, and inverted. The choice of user-specified symmetry group directly governs the trade-off between approximation accuracy and computational cost. Moreover, our framework provides a unifying theoretical lens for viewing existing methods; in particular, a specific choice of symmetry group recovers Shampoo/Muon-like curvature estimates. We validate our method on a range of network architectures, and deploy it on second-order optimization benchmarks, including a small language model. Our curvature estimation framework might find applications in other machine learning problems such as uncertainty estimation, continual learning, compression/pruning, training data attribution, and more.
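To make the idea of a symmetry-induced curvature constraint concrete, here is a minimal numerical sketch. It is not the paper's construction. It only illustrates one well-known weight-space symmetry, the positive rescaling symmetry of a two-layer ReLU network, (W1, W2) -> (a * W1, W2 / a). Differentiating the invariance once shows that the gradient is orthogonal to the direction v = (W1, -W2); differentiating again gives H v = (-g1, +g2), so a single gradient determines the Hessian's action along v. The network sizes, the random data, and the names `loss` and `grad` are assumptions made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU regression network (hypothetical sizes and data).
X = rng.normal(size=(32, 5))
y = rng.normal(size=(32, 1))
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(4, 1))


def loss(W1, W2):
    """Half mean squared error of the network relu(X @ W1) @ W2."""
    hidden = np.maximum(X @ W1, 0.0)
    return 0.5 * np.mean((hidden @ W2 - y) ** 2)


def grad(W1, W2):
    """Analytic gradient of the loss with respect to (W1, W2)."""
    pre = X @ W1
    hidden = np.maximum(pre, 0.0)
    resid = (hidden @ W2 - y) / X.shape[0]
    gW2 = hidden.T @ resid
    gW1 = X.T @ ((resid @ W2.T) * (pre > 0))
    return gW1, gW2


gW1, gW2 = grad(W1, W2)

# First-order constraint: the gradient is orthogonal to the symmetry
# direction v = (W1, -W2), the generator of the rescaling at a = 1.
first_order = np.sum(gW1 * W1) - np.sum(gW2 * W2)

# Second-order constraint: H v = (-gW1, +gW2). The Hessian-vector product
# H v is estimated here by central finite differences of the gradient.
eps = 1e-5
gp = grad(W1 * (1 + eps), W2 * (1 - eps))
gm = grad(W1 * (1 - eps), W2 * (1 + eps))
Hv_W1 = (gp[0] - gm[0]) / (2 * eps)
Hv_W2 = (gp[1] - gm[1]) / (2 * eps)

print("first-order residual:", abs(first_order))
print("max |Hv_W1 + gW1|:", np.max(np.abs(Hv_W1 + gW1)))
print("max |Hv_W2 - gW2|:", np.max(np.abs(Hv_W2 - gW2)))
```

All three printed residuals should be close to zero, up to floating-point and finite-difference error. This pins down the Hessian along a single direction only; the abstract describes going further by averaging over a whole group of such symmetries to obtain a structured approximation of the full Hessian.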

