CL · AI · Jun 22, 2024

RankAdaptor: Hierarchical Rank Allocation for Efficient Fine-Tuning Pruned LLMs via Performance Model

arXiv:2406.15734v2 · 13 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient fine-tuning of compressed LLMs, which matters for deploying large models in resource-constrained environments. The advance is incremental: it builds on existing structural pruning and LoRA techniques.

The paper tackles performance recovery in structurally pruned large language models (LLMs). It introduces RankAdaptor, a hierarchical rank allocation method that fine-tunes pruned LLMs according to layer-specific recovery needs, and reports improvements of 0.7% to 5.5% over state-of-the-art methods across various benchmarks.

The efficient compression of large language models (LLMs) has become increasingly popular. However, recovering the performance of compressed LLMs remains a major challenge. The current practice in LLM compression entails the implementation of structural pruning, complemented by a recovery phase that leverages the Low-Rank Adaptation (LoRA) algorithm. Structural pruning's uneven modification of model architecture, coupled with standard LoRA's fixed configuration allocation across layers in an online pipeline, leads to suboptimal performance in various downstream tasks for pruned models. To address this challenge, we introduce RankAdaptor, a hierarchical rank allocation method that enables efficient fine-tuning of pruned LLMs according to layerwise specific recovery requirements. We employ a performance model that conducts offline meta-learning and online incremental learning to explore optimal rank values for each layer. Comprehensive experiments on popular benchmarks show that RankAdaptor consistently outperforms state-of-the-art methods across a variety of pruning settings and LLM architectures, with improvements ranging from 0.7% to 5.5%.
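
To make the contrast between a fixed LoRA rank and a per-layer rank concrete, here is a minimal sketch. It relies only on the standard LoRA fact that a rank-r adapter on a d_out x d_in weight matrix adds r * (d_in + d_out) trainable parameters. The layer shapes and rank values are hypothetical, chosen for illustration and not taken from the paper. The sketch does not reproduce RankAdaptor's performance model; it only shows what a layerwise allocation looks like once the ranks have been chosen.

```python
# Illustrative sketch only: layer shapes and rank values are made up,
# not taken from the paper, and no rank-selection logic is implemented.

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns an update B @ A with B of shape (d_out, rank) and
    A of shape (rank, d_in), so it adds rank * (d_in + d_out) parameters.
    """
    return rank * (d_in + d_out)


def total_lora_params(layer_shapes, ranks):
    """Sum LoRA parameters over layers, using one rank per layer."""
    if len(layer_shapes) != len(ranks):
        raise ValueError("need exactly one rank per layer")
    return sum(
        lora_params(d_in, d_out, r)
        for (d_in, d_out), r in zip(layer_shapes, ranks)
    )


# Hypothetical pruned model: structural pruning left layers with uneven widths.
layer_shapes = [(4096, 4096), (4096, 3072), (4096, 2048), (4096, 4096)]

uniform_ranks = [8, 8, 8, 8]      # standard LoRA: the same rank for every layer
layerwise_ranks = [4, 8, 12, 6]   # layerwise allocation: rank varies per layer

uniform_total = total_lora_params(layer_shapes, uniform_ranks)
layerwise_total = total_lora_params(layer_shapes, layerwise_ranks)
print(uniform_total, layerwise_total)
```

In this toy setup the layerwise allocation gives the most heavily pruned layer the largest rank while using fewer trainable parameters overall than the uniform one. Which layers actually need more rank is the question the paper's performance model is meant to answer.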

