CL · Dec 13, 2024

ASLoRA: Adaptive Sharing Low-Rank Adaptation Across Layers

arXiv:2412.10135v2 · 3 citations · h-index: 41
Originality: Incremental advance
AI Analysis

This work addresses efficient fine-tuning of large language models, offering an incremental improvement in parameter efficiency over existing methods such as LoRA.

The paper tackles the high computational and storage costs of fine-tuning large language models by proposing ASLoRA, a parameter-efficient method that shares low-rank matrices across layers adaptively, achieving better performance than LoRA with less than 25% of the parameters.

As large language models (LLMs) grow in size, traditional full fine-tuning becomes increasingly impractical due to its high computational and storage costs. Although popular parameter-efficient fine-tuning methods, such as LoRA, have significantly reduced the number of tunable parameters, there is still room for further optimization. In this work, we propose ASLoRA, a cross-layer parameter-sharing strategy combining global sharing with partial adaptive sharing. Specifically, we share the low-rank matrix A across all layers and adaptively merge matrix B during training. This sharing mechanism not only mitigates overfitting effectively but also captures inter-layer dependencies, significantly enhancing the model's representational capability. We conduct extensive experiments on various NLP tasks, showing that ASLoRA outperforms LoRA while using less than 25% of the parameters, highlighting its flexibility and superior parameter efficiency. Furthermore, in-depth analyses of the adaptive sharing strategy confirm its significant advantages in enhancing both model flexibility and task adaptability.
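To make the parameter saving concrete, the following is a minimal counting sketch, not the authors' implementation. It assumes one adapted weight matrix per layer, with A of shape (rank, d_in) and B of shape (d_out, rank). The function names, the layer count, the dimensions, and the number of B groups left after merging are all illustrative assumptions; the abstract does not state how B matrices are chosen for merging, so the sketch only counts how many distinct B matrices remain.

```python
def lora_params(num_layers: int, d_in: int, d_out: int, rank: int) -> int:
    """Standard LoRA: every layer owns its own A (rank x d_in) and B (d_out x rank)."""
    return num_layers * rank * (d_in + d_out)


def shared_params(num_b_groups: int, d_in: int, d_out: int, rank: int) -> int:
    """One A shared by all layers, plus one B per group of layers that share a B.

    num_b_groups is hypothetical: it stands for the number of distinct B
    matrices left after adaptive merging (equal to the layer count if
    nothing has been merged).
    """
    return rank * d_in + num_b_groups * rank * d_out


# Illustrative numbers only (not taken from the paper).
NUM_LAYERS, D_IN, D_OUT, RANK = 12, 768, 768, 8

baseline = lora_params(NUM_LAYERS, D_IN, D_OUT, RANK)
shared_a_only = shared_params(NUM_LAYERS, D_IN, D_OUT, RANK)  # A shared, no B merged
shared_a_merged_b = shared_params(4, D_IN, D_OUT, RANK)       # A shared, 4 B groups

ratio_a_only = shared_a_only / baseline
ratio_merged = shared_a_merged_b / baseline

print(f"LoRA:                   {baseline} parameters")
print(f"shared A:               {shared_a_only} ({ratio_a_only:.1%} of LoRA)")
print(f"shared A, 4 B groups:   {shared_a_merged_b} ({ratio_merged:.1%} of LoRA)")
```

Under these assumed sizes, sharing A alone removes a little under half of the adapter parameters, and the count only drops below a quarter of the LoRA baseline once the B matrices are also merged into a small number of groups.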

