NA, LG · Feb 5, 2025

An Augmented Backward-Corrected Projector Splitting Integrator for Dynamical Low-Rank Training

arXiv:2502.03006v1 · 6 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work targets memory-efficient neural network training and is relevant to both researchers and practitioners. It is an incremental advance because it builds on existing dynamical low-rank training methods.

The paper addresses the computational cost of existing dynamical low-rank training methods by introducing a method that reduces the number of required QR decompositions, ensuring convergence to a locally optimal solution and demonstrating effectiveness across multiple benchmarks.

Layer factorization has emerged as a widely used technique for training memory-efficient neural networks. However, layer factorization methods face several challenges, particularly a lack of robustness during the training process. To overcome this limitation, dynamical low-rank training methods have been developed, utilizing robust time integration techniques for low-rank matrix differential equations. Although these approaches facilitate efficient training, they still depend on computationally intensive QR and singular value decompositions of matrices with small rank. In this work, we introduce a novel low-rank training method that reduces the number of required QR decompositions. Our approach integrates an augmentation step into a projector-splitting scheme, ensuring convergence to a locally optimal solution. We provide a rigorous theoretical analysis of the proposed method and demonstrate its effectiveness across multiple benchmarks.
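To make the terms in the abstract concrete, here is a minimal NumPy sketch of a factorized layer W = U S V^T, a QR-based basis augmentation, and an SVD-based truncation back to rank r. It is a generic illustration of those building blocks, not the integrator proposed in the paper. The helper names `augment_basis` and `truncate`, the layer sizes, and the random matrix `G` standing in for gradient information are all assumptions made for this example.

```python
import numpy as np


def augment_basis(U, G):
    """Return an orthonormal basis of span([U, G]) using one QR decomposition."""
    Q, _ = np.linalg.qr(np.hstack([U, G]))
    return Q


def truncate(U, S, V, r):
    """Truncate the factorization U S V^T to rank r via an SVD of the small core S."""
    P, sigma, Qt = np.linalg.svd(S, full_matrices=False)
    return U @ P[:, :r], np.diag(sigma[:r]), V @ Qt.T[:, :r]


rng = np.random.default_rng(0)
n, m, r = 64, 48, 4  # hypothetical layer shape and rank

# Factorized layer: orthonormal bases U (n x r), V (m x r) and a small core S (r x r).
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
V, _ = np.linalg.qr(rng.standard_normal((m, r)))
S = np.diag(rng.uniform(1.0, 2.0, r))

# Memory comparison: dense weight matrix versus its three factors.
dense_params = n * m
factored_params = n * r + r * r + m * r

# Augment the column basis with new directions (G is a stand-in for gradient information).
G = rng.standard_normal((n, r))
U_aug = augment_basis(U, G)          # shape (n, 2r)

# Re-express the core in the augmented basis, then truncate back to rank r.
S_aug = U_aug.T @ U @ S              # shape (2r, r)
U_new, S_new, V_new = truncate(U_aug, S_aug, V, r)
```

In this sketch no training step is taken between augmentation and truncation, so the truncated factors reproduce the original weight matrix up to rounding. In an actual training loop the core would be updated in the augmented basis before truncation, which is where the cost of the QR and SVD steps mentioned in the abstract arises.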

Code Implementations: 1 repo
