LG · Mar 6

Weak-SIGReg: Covariance Regularization for Stable Deep Learning

arXiv:2603.05924v1 · h-index: 2 · Has Code
Predicted impact: top 63% in LG · last 90 days · Originality: Incremental advance
AI Analysis

The method targets training instability in low-data settings and in networks trained without architectural stabilizers, offering deep learning practitioners a general-purpose optimization stabilizer.

The paper tackles optimization collapse in neural networks such as Vision Transformers and deep MLPs by proposing Weak-SIGReg, a covariance regularization method that stabilizes training. It recovers ViT accuracy on CIFAR-100 from a collapsed 20.73% to 72.02% and improves the convergence of deep vanilla MLPs trained with pure SGD.

Modern neural network optimization relies heavily on architectural priors, such as Batch Normalization and Residual connections, to stabilize training dynamics. Without these, or in low-data regimes with aggressive augmentation, low-bias architectures like Vision Transformers (ViTs) often suffer from optimization collapse. This work adopts Sketched Isotropic Gaussian Regularization (SIGReg), recently introduced in the LeJEPA self-supervised framework, and repurposes it as a general optimization stabilizer for supervised learning. While the original formulation targets the full characteristic function, a computationally efficient variant is derived, Weak-SIGReg, which targets the covariance matrix via random sketching. Inspired by interacting particle systems, representation collapse is viewed as stochastic drift; SIGReg constrains the representation density towards an isotropic Gaussian, mitigating this drift. Empirically, SIGReg recovers the training of a ViT on CIFAR-100 from a collapsed 20.73% to 72.02% accuracy without architectural hacks and significantly improves the convergence of deep vanilla MLPs trained with pure SGD. Code is available at https://github.com/kreasof-ai/sigreg.
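The abstract does not spell out the loss, so the following is only a minimal NumPy sketch of what a covariance penalty with random sketching might look like, not the authors' implementation (that lives in the linked repository). The function name `weak_sigreg_penalty`, the Gaussian sketch matrix, its 1/dim scaling, and the squared Frobenius distance to the sketched identity are all assumptions made for illustration.

```python
import numpy as np


def weak_sigreg_penalty(z, num_sketches=16, rng=None):
    """Illustrative covariance penalty on randomly sketched features.

    z            : array of shape (batch, dim), one representation per row.
    num_sketches : number of random directions k (assumed k << dim).
    Returns a non-negative float that is 0 when the sample covariance of z
    is exactly the identity.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, dim = z.shape

    # Gaussian sketch with entries N(0, 1/dim) (an assumption of this sketch).
    sketch = rng.standard_normal((dim, num_sketches)) / np.sqrt(dim)

    # Project the centered features first, so only a (k x k) covariance is
    # formed instead of the full (dim x dim) matrix.
    projected = (z - z.mean(axis=0)) @ sketch
    cov_sketched = projected.T @ projected / (batch - 1)

    # Sketch of the isotropic target: S^T I S = S^T S.
    target = sketch.T @ sketch

    # Squared Frobenius distance between sketched covariance and target.
    return float(np.sum((cov_sketched - target) ** 2) / num_sketches)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    isotropic = rng.standard_normal((4096, 32))
    # Collapsed features: every row lies along a single direction.
    collapsed = np.outer(rng.standard_normal(4096), np.ones(32) / np.sqrt(32))
    print(weak_sigreg_penalty(isotropic, rng=np.random.default_rng(1)))
    print(weak_sigreg_penalty(collapsed, rng=np.random.default_rng(1)))
```

In this sketch, near-isotropic features should give a penalty close to zero, while features collapsed onto one direction should give a much larger value. In training, a term of this kind would presumably be added to the supervised loss with a weighting coefficient, which is likewise an assumption here.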

Code Implementations: 1 repo