LG | AI | Nov 21, 2024

Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation

arXiv:2411.15224v3 | 3 citations | h-index: 9 | CVPR
Originality: Incremental advance
AI Analysis

The paper addresses the problem of efficiently adapting Mamba models to new tasks, offering a domain-specific solution that is incremental but novel for this architecture.

The paper tackles parameter-efficient fine-tuning for the Mamba architecture. It finds that Projectors, not state-space models, are the primary contributors to transfer learning, and proposes ProDiaL, which adapts only the pretrained Projectors using diagonal-centric linear transformations. The method achieves strong performance with less than 1% of total parameters across vision and language models.

Despite the growing interest in the Mamba architecture as a potential replacement for the Transformer architecture, parameter-efficient fine-tuning (PEFT) approaches for Mamba remain largely unexplored. In our study, we introduce two key insight-driven strategies for PEFT in the Mamba architecture: (1) While state-space models (SSMs) have been regarded as the cornerstone of the Mamba architecture, and were therefore expected to play a primary role in transfer learning, our findings reveal that Projectors, not SSMs, are the predominant contributors to transfer learning. (2) Based on this observation, we propose a novel PEFT method specialized to the Mamba architecture: Projector-targeted Diagonal-centric Linear Transformation (ProDiaL). ProDiaL adapts only the pretrained Projectors to new tasks by optimizing diagonal-centric linear transformation matrices, without directly fine-tuning the Projector weights. This targeted approach allows efficient task adaptation, utilizing less than 1% of the total parameters, and exhibits strong performance across both vision and language Mamba models, highlighting its versatility and effectiveness.
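
The abstract does not spell out the form of the transformation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the adapted weight is the frozen Projector weight `W` multiplied by a trainable matrix `T` made of a diagonal plus a small low-rank off-diagonal term. The function names, the `diag + B @ A` parameterization, and all sizes are hypothetical choices for illustration.

```python
# Hypothetical sketch: the diag + low-rank form, names, and shapes are
# assumptions for illustration, not the method as defined in the paper.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def build_transform(diag, B, A):
    """T = diag(diag) + B @ A: a trainable diagonal plus a low-rank correction."""
    T = matmul(B, A)              # d_in x d_in, rank at most r
    for i in range(len(diag)):
        T[i][i] += diag[i]        # add the diagonal entries
    return T

def adapted_weight(W, diag, B, A):
    """Frozen W (d_out x d_in) times the trainable transform T (d_in x d_in)."""
    return matmul(W, build_transform(diag, B, A))

def trainable_fraction(d_out, d_in, rank):
    """Transform parameters (diag, B, A) relative to one frozen projector's size."""
    return (d_in + 2 * d_in * rank) / (d_out * d_in)

# Toy frozen projector weight, 2 x 3. It is never modified below.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
d_in, rank = 3, 1

# Identity initialisation: diag = 1 and B = 0 give T = I, so the adapted
# weight starts out equal to the pretrained weight.
diag = [1.0] * d_in
B = [[0.0] * rank for _ in range(d_in)]   # d_in x r
A = [[0.1] * d_in for _ in range(rank)]   # r x d_in
W_init = adapted_weight(W, diag, B, A)

# A pretend training step changes only diag; W stays frozen.
W_updated = adapted_weight(W, [2.0, 1.0, 0.5], B, A)

# Illustrative sizes only, not the paper's configuration.
fraction = trainable_fraction(1024, 1024, 4)
```

In this sketch the identity initialisation means training starts from the pretrained behaviour, and only `diag`, `B`, and `A` would receive gradients. The `fraction` value is a per-projector ratio for made-up sizes, so it should not be read as reproducing the paper's figure of less than 1% of total model parameters.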

