LG | AI | ML | Feb 5, 2024

Flora: Low-Rank Adapters Are Secretly Gradient Compressors

arXiv:2402.03293v2 | 118 citations | h-index: 4 | ICML
AI Analysis

This work addresses the memory constraints that AI researchers and practitioners face when training large models, offering an incremental improvement over LoRA.

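The paper tackles the memory cost of training large neural networks. It analyzes low-rank adaptation (LoRA) and finds that its dynamics can be approximated by a random projection. Building on this observation, it proposes Flora, which achieves high-rank updates by resampling the projection matrices while keeping the space complexity of the optimization states sublinear. The authors report experiments across different tasks and model architectures to verify its effectiveness.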

Despite large neural networks demonstrating remarkable abilities to complete different tasks, they require excessive memory usage to store the optimization states for training. To alleviate this, the low-rank adaptation (LoRA) is proposed to reduce the optimization states by training fewer parameters. However, LoRA restricts overall weight update matrices to be low-rank, limiting the model performance. In this work, we investigate the dynamics of LoRA and identify that it can be approximated by a random projection. Based on this observation, we propose Flora, which is able to achieve high-rank updates by resampling the projection matrices while enjoying the sublinear space complexity of optimization states. We conduct experiments across different tasks and model architectures to verify the effectiveness of our approach.
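
The idea in the abstract, compressing accumulated gradients with a random projection and resampling that projection to escape the low-rank restriction, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`make_projection`, `compressed_update`), the Gaussian projection with variance 1/rank, the matrix sizes, and the use of a seed to regenerate the projection are all assumptions made for illustration. It only shows that updates compressed with one fixed projection stay at rank at most `rank`, while summing updates from resampled projections yields a higher-rank total.

```python
import numpy as np


def make_projection(seed, rank, n_cols):
    """Hypothetical helper: Gaussian projection A of shape (rank, n_cols).

    Entries are N(0, 1/rank), so E[A.T @ A] is the identity and the
    decompressed gradient is an unbiased estimate of the original.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((rank, n_cols)) / np.sqrt(rank)


def compressed_update(grads, rank, seed):
    """Accumulate gradients in compressed (m x rank) form, then decompress.

    Only the (m x rank) accumulator and the seed are kept while accumulating;
    the projection can be regenerated from the seed when needed.
    """
    m, n = grads[0].shape
    proj = make_projection(seed, rank, n)
    acc = np.zeros((m, rank))
    for g in grads:
        acc += g @ proj.T          # compress: (m, n) @ (n, rank)
    return acc @ proj              # decompress: (m, rank) @ (rank, n)


# Toy data (assumed): 4 rounds of 3 random 16 x 16 "gradients" each.
data_rng = np.random.default_rng(0)
m, n, rank = 16, 16, 2
rounds = [[data_rng.standard_normal((m, n)) for _ in range(3)] for _ in range(4)]

# Same projection every round: the total update is confined to a rank-2 subspace.
fixed = sum(compressed_update(gs, rank, seed=0) for gs in rounds)

# A fresh projection each round: the total update spans more directions.
resampled = sum(compressed_update(gs, rank, seed=s) for s, gs in enumerate(rounds))

rank_fixed = np.linalg.matrix_rank(fixed)
rank_resampled = np.linalg.matrix_rank(resampled)
print("fixed projection rank:", rank_fixed)
print("resampled projection rank:", rank_resampled)
```

In this toy setting the memory held during accumulation scales with `m * rank` rather than `m * n`, which is the kind of saving the abstract refers to; how the paper handles momentum and other optimizer states is not covered by this sketch.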

Code Implementations: 2 repos