CL · LG · Sep 25, 2024

PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning

arXiv:2409.16722v1 · 21 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses the problem of efficient fine-tuning for large language models. It offers a method that improves on existing approaches such as LoRA, though the advance appears incremental within parameter-efficient fine-tuning.

The paper tackles the limitations of low-rank adaptation (LoRA) in LLM fine-tuning by proposing PMSS, which selects skeletons from pre-trained weight matrices to enable high-rank updates at low cost. Reported gains over LoRA include +3.4% on DROP and +12.89% on GSM8K (both with LLaMA2-7B), using fewer trainable parameters.

Low-rank adaptation (LoRA) and its variants have recently gained much interest due to their ability to avoid excessive inference costs. However, LoRA still faces two challenges: (1) the low-rank assumption is limiting; and (2) its initialization method may be suboptimal. To this end, we propose PMSS (Pre-trained Matrices Skeleton Selection), which enables high-rank updates at low cost while leveraging the semantic and linguistic information inherent in pre-trained weights. It achieves this by selecting skeletons from the pre-trained weight matrix and learning only a small matrix instead. Experiments demonstrate that PMSS outperforms LoRA and other fine-tuning methods across tasks with far fewer trainable parameters. We demonstrate its effectiveness especially on complex tasks such as the DROP benchmark (+3.4%/+5.9% on LLaMA2-7B/13B) and math reasoning (+12.89%/+5.61%/+3.11% on GSM8K with LLaMA2-7B, Mistral-7B, and Gemma-7B). The code and model will be released soon.
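The idea of "selecting skeletons and learning only a small matrix" can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the skeleton is chosen, so the norm-based selection below is an assumption, as are the function names `select_skeleton` and `skeleton_update` and the zero initialization of `U`. The sketch assumes the update takes the form C · U · R, where C and R are frozen columns and rows copied from the pre-trained weight W, and only the k × k matrix U is trained.

```python
import numpy as np

def select_skeleton(W, k):
    """Pick k row and k column indices of W with the largest L2 norms.

    Assumption: norm-based selection is a stand-in heuristic; the
    abstract does not specify the selection rule PMSS uses.
    """
    col_idx = np.argsort(-np.linalg.norm(W, axis=0))[:k]
    row_idx = np.argsort(-np.linalg.norm(W, axis=1))[:k]
    return np.sort(row_idx), np.sort(col_idx)

def skeleton_update(W, row_idx, col_idx, U):
    """Build the weight update C @ U @ R from frozen slices of W."""
    C = W[:, col_idx]   # (m, k) frozen columns of the pre-trained weight
    R = W[row_idx, :]   # (k, n) frozen rows of the pre-trained weight
    return C @ U @ R    # (m, n) update; only U would be trained

rng = np.random.default_rng(0)
m, n, k = 64, 48, 8
W = rng.standard_normal((m, n))        # stand-in for a pre-trained weight

row_idx, col_idx = select_skeleton(W, k)
U = np.zeros((k, k))                   # assumed zero init: update starts at 0
delta = skeleton_update(W, row_idx, col_idx, U)

skeleton_params = U.size               # k * k trainable values
lora_params = k * (m + n)              # LoRA at the same rank k: B (m, k) + A (k, n)
print(skeleton_params, lora_params)
```

Under these assumptions, the trainable parameter count grows as k² rather than k(m + n), so for a fixed parameter budget the skeleton form can afford a larger k, and hence a higher-rank update, than LoRA.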

