LG · May 30

Rethinking Bregman Divergences in Kronecker-Factored Optimizers

arXiv:2606.00542 · 83.9
AI Analysis

For practitioners of large-scale optimization, this work provides a principled way to design more efficient Kronecker-factored preconditioners by leveraging spectral properties of the covariance matrix.

The paper analyzes how different Bregman divergences distribute Kronecker approximation errors across the covariance spectrum in Shampoo-style optimizers, and proposes a subspace-aware optimizer that improves convergence by applying eigenvalue-based preconditioning in the top eigenspace and adaptive isotropic acceleration in the bottom subspace.
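
To make the split between the top eigenspace and the bottom subspace concrete, here is a minimal NumPy sketch. It is not the paper's algorithm: it works on a full (non-Kronecker) covariance for a vector gradient, it assumes "eigenvalue-based preconditioning" means inverse-square-root scaling, and the function name `subspace_aware_precondition`, the parameter `k`, and the default `tail_scale` rule (inverse square root of the mean tail eigenvalue) are hypothetical stand-ins for the paper's adaptive constant.

```python
import numpy as np


def subspace_aware_precondition(grad, cov, k, tail_scale=None, eps=1e-12):
    """Scale `grad` per-eigenvalue in the top-k eigenspace of `cov` and by one
    isotropic constant in the remaining directions (illustrative only)."""
    n = cov.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    evals, evecs = np.linalg.eigh(cov)
    top_vals, top_vecs = evals[n - k:], evecs[:, n - k:]
    tail_vals = evals[:n - k]

    # Assumption: a single constant for the tail, derived from its mean eigenvalue.
    if tail_scale is None:
        if tail_vals.size:
            tail_scale = 1.0 / np.sqrt(max(tail_vals.mean(), eps))
        else:
            tail_scale = 1.0

    coeffs = top_vecs.T @ grad                      # coordinates in the top eigenspace
    top_part = top_vecs @ (coeffs / np.sqrt(np.maximum(top_vals, eps)))
    tail_part = grad - top_vecs @ coeffs            # component outside the top eigenspace
    return top_part + tail_scale * tail_part


cov = np.diag([4.0, 1.0, 1.0])
grad = np.array([2.0, 3.0, 5.0])
update = subspace_aware_precondition(grad, cov, k=1, tail_scale=0.5)
# update is approximately [1.0, 1.5, 2.5]
```

The tail component is computed as a residual, so the bottom eigenvectors are never used individually. That is in keeping with the observation that the tail spectrum is noisy, although how the paper actually handles the tail inside a Kronecker-factored preconditioner is not specified here.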

Shampoo-style optimizers approximate gradient covariance matrices using Kronecker-factored structures. Recent work \cite{lin2026understanding} showed that such approximations can be viewed as projections under Bregman matrix divergences, leading to different Kronecker-factored preconditioners. However, it remains unclear what role the choice of divergence plays when the covariance is not exactly Kronecker-factored. We study this question through the spectrum of the covariance matrix. We show that the Frobenius, von Neumann, and LogDet divergences distribute the unavoidable Kronecker approximation error differently across the covariance spectrum. We further show that their Kronecker factors are governed by divergence-weighted residuals rather than the raw approximation error, explaining how these spectral preferences are realized in the resulting preconditioners. Empirically, we observe that the top covariance eigenspace is substantially better aligned with the Hessian matrix, while the tail of the spectrum is much noisier and less reliable. Motivated by these findings, we propose a subspace-aware Kronecker optimizer that applies eigenvalue-based preconditioning in the top subspace and uses an adaptive isotropic acceleration constant in the bottom subspace.
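
For reference, the three divergences named in the abstract have the following standard textbook forms for symmetric positive definite $X, Y \in \mathbb{R}^{d \times d}$. The symbols $X$, $Y$, $\phi$, and $d$ are notation chosen here, and the page does not say which argument holds the covariance and which holds the Kronecker-factored approximation, so the argument order below is an assumption.

```latex
% Bregman matrix divergence generated by a strictly convex, differentiable \phi
D_\phi(X, Y) = \phi(X) - \phi(Y) - \operatorname{tr}\!\big(\nabla\phi(Y)^{\top}(X - Y)\big)

% Frobenius: \phi(X) = \tfrac{1}{2}\|X\|_F^2
D_{\mathrm{F}}(X, Y) = \tfrac{1}{2}\,\|X - Y\|_F^2

% von Neumann: \phi(X) = \operatorname{tr}(X \log X - X)
D_{\mathrm{vN}}(X, Y) = \operatorname{tr}\!\big(X \log X - X \log Y - X + Y\big)

% LogDet: \phi(X) = -\log\det X
D_{\mathrm{LD}}(X, Y) = \operatorname{tr}\!\big(X Y^{-1}\big) - \log\det\!\big(X Y^{-1}\big) - d
```

When $X$ and $Y$ share eigenvectors, each divergence reduces to a sum over eigenvalue pairs: squared differences for Frobenius, relative-entropy-like terms for von Neumann, and terms in the ratios $\lambda_i(X)/\lambda_i(Y)$ for LogDet. This is one way to see why the three can weight errors in large and small eigenvalues differently.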
