CL | AI | May 24, 2025

ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models

arXiv:2505.18799v4 | 2 citations | h-index: 4 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses efficiency issues in LLM alignment for practitioners, though the advance is incremental because it builds on prior work on attention head analysis.

The paper tackles the high training cost of aligning large language models to downstream tasks. It proposes ALPS, which localizes the most task-sensitive attention heads and restricts attention training updates to those heads, reducing alignment costs. The method activates only 10% of attention parameters during fine-tuning while achieving a 2% performance improvement over baselines on three tasks.

Aligning general-purpose large language models (LLMs) to downstream tasks often incurs significant training adjustment costs. Prior research has explored various avenues to enhance alignment efficiency, primarily through minimal-data training or data-driven activations to identify key attention heads. However, these approaches inherently introduce data dependency, which hinders generalization and reusability. To address this issue and enhance model alignment efficiency, we propose the Attention Localization and Pruning Strategy (ALPS), an efficient algorithm that localizes the most task-sensitive attention heads and prunes by restricting attention training updates to these heads, thereby reducing alignment costs. Experimental results demonstrate that our method activates only 10% of attention parameters during fine-tuning while achieving a 2% performance improvement over baselines on three tasks. Moreover, the identified task-specific heads are transferable across datasets and mitigate knowledge forgetting. Our work and findings provide a novel perspective on efficient LLM alignment. The code is available at https://github.com/VoiceBeer/ALPS.
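The localize-then-restrict idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a per-head sensitivity score is already available, and the abstract does not say how that score is computed. The names `select_top_heads` and `mask_head_gradients`, the toy scores, and the flattened per-head gradient lists are all hypothetical.

```python
import math


def select_top_heads(scores, fraction=0.10):
    """Return the (layer, head) pairs with the highest sensitivity scores.

    `scores` maps (layer, head) -> float. How the score is obtained is an
    assumption left open here; it is a hypothetical input.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one head, otherwise roughly `fraction` of all heads.
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=lambda lh: scores[lh], reverse=True)
    return set(ranked[:k])


def mask_head_gradients(grads, selected):
    """Zero the gradient of every head that is not in `selected`.

    `grads` maps (layer, head) -> list of floats (a flattened per-head
    gradient). Only the selected heads keep a non-zero update.
    """
    return {
        lh: (list(g) if lh in selected else [0.0] * len(g))
        for lh, g in grads.items()
    }


# Toy setup: 2 layers x 10 heads, with made-up scores and gradients.
scores = {(layer, head): float(layer * 10 + head)
          for layer in range(2) for head in range(10)}
selected = select_top_heads(scores, fraction=0.10)
grads = {lh: [1.0, -2.0] for lh in scores}
masked = mask_head_gradients(grads, selected)
```

In this toy setup, only the selected heads retain their gradients, so an optimizer step would leave all other attention heads unchanged. In a real framework the same effect is usually obtained by masking gradients or freezing parameters per head, which depends on how the attention weights are laid out.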
