CVJun 5, 2024

Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision

arXiv:2406.03051v214 citations
AI Analysis

This work addresses the problem of efficient adaptation of large foundation models for vision applications, representing a significant advance rather than an incremental improvement.

The paper tackles the challenge of balancing efficiency and generalization in parameter-efficient fine-tuning for vision tasks by introducing Adapter-X, which outperforms full fine-tuning in 2D image and 3D point cloud classification with only 0.20% and 1.88% of trainable parameters, respectively.

Parameter-efficient fine-tuning (PEFT) has become increasingly important as foundation models continue to grow in both popularity and size. Adapter has been particularly well-received due to their potential for parameter reduction and adaptability across diverse tasks. However, striking a balance between high efficiency and robust generalization across tasks remains a challenge for adapter-based methods. We analyze existing methods and find that: 1) parameter sharing is the key to reducing redundancy; 2) more tunable parameters, dynamic allocation, and block-specific design are keys to improving performance. Unfortunately, no previous work considers all these factors. Inspired by this insight, we introduce a novel framework named Adapter-X. First, a Sharing Mixture of Adapters (SMoA) module is proposed to fulfill token-level dynamic allocation, increased tunable parameters, and inter-block sharing at the same time. Second, some block-specific designs like Prompt Generator (PG) are introduced to further enhance the ability of adaptation. Extensive experiments across 2D image and 3D point cloud modalities demonstrate that Adapter-X represents a significant milestone as it is the first to outperform full fine-tuning in both 2D image and 3D point cloud modalities with significantly fewer parameters, i.e., only 0.20% and 1.88% of original trainable parameters for 2D and 3D classification tasks. Our code will be publicly available.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes