LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters
This work addresses parametrization ambiguity in the low-rank fine-tuning of large models such as LLMs and diffusion models, and proposes a geometric approach aimed at faster convergence and better final task performance.
The paper tackles parametrization ambiguity in Low-Rank Adaptation (LoRA) by introducing a fully Riemannian framework that optimizes low-rank adapters directly on the fixed-rank manifold. The authors report consistent improvements in convergence speed and final task performance over standard LoRA and its state-of-the-art modifications.
This work presents a novel, fully Riemannian framework for Low-Rank Adaptation (LoRA) that treats low-rank adapters geometrically by optimizing them directly on the fixed-rank manifold. This formulation eliminates the parametrization ambiguity present in standard Euclidean optimizers. Our framework integrates three key components to achieve this: (1) we derive Riemannion, a new Riemannian optimizer on the fixed-rank matrix manifold that generalizes the recently proposed Muon optimizer; (2) we develop a Riemannian gradient-informed LoRA initialization; and (3) we provide an efficient implementation without significant overhead that uses automatic differentiation to compute the geometric operations that arise while adhering to best practices in numerical linear algebra. Comprehensive experimental results on both LLM and diffusion model architectures demonstrate that our approach yields consistent and noticeable improvements in convergence speed and final task performance over both standard LoRA and its state-of-the-art modifications.
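The parametrization ambiguity referred to in the abstract can be illustrated with a small NumPy sketch. It is not the Riemannion optimizer and is not taken from the paper; it only shows, under illustrative assumptions, that two factor pairs with the same product A Bᵀ lead to different updates under a plain Euclidean gradient step on the factors, whereas the projection of the gradient onto the tangent space of the fixed-rank manifold depends only on the product. All names (`A`, `B`, `S`, `G`, `euclidean_step`, `tangent_projection`) and the random stand-in gradient are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2

# One parametrization of the adapter: delta_W = A @ B.T
A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))

# A second parametrization of the SAME adapter, using an invertible r x r matrix S:
# (A S)(B S^{-T})^T = A S S^{-1} B^T = A B^T
S = np.array([[2.0, 1.0], [0.0, 0.5]])
A2 = A @ S
B2 = B @ np.linalg.inv(S).T

# Stand-in for the loss gradient with respect to delta_W (assumed, random).
G = rng.standard_normal((m, n))

def euclidean_step(A, B, G, lr=0.1):
    """One gradient step on the factors; returns the updated product."""
    A_new = A - lr * G @ B      # dL/dA = G B
    B_new = B - lr * G.T @ A    # dL/dB = G^T A
    return A_new @ B_new.T

def tangent_projection(A, B, G):
    """Project G onto the tangent space of the rank-r manifold at A @ B.T."""
    U, _ = np.linalg.qr(A)      # orthonormal basis of the column space
    V, _ = np.linalg.qr(B)      # orthonormal basis of the row space
    PU, PV = U @ U.T, V @ V.T
    return PU @ G + G @ PV - PU @ G @ PV

same_product = np.allclose(A @ B.T, A2 @ B2.T)
step_gap = np.linalg.norm(euclidean_step(A, B, G) - euclidean_step(A2, B2, G))
proj_gap = np.linalg.norm(tangent_projection(A, B, G) - tangent_projection(A2, B2, G))

print("same adapter:", same_product)
print("difference between Euclidean steps:", step_gap)
print("difference between tangent projections:", proj_gap)
```

In this sketch the two Euclidean steps disagree because `S` is not orthogonal, while the tangent projections agree up to floating-point error because `A` and `A @ S` span the same column space (and likewise for `B`).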