CL AI CV LG MM | Oct 18, 2023

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling

arXiv:2310.12100v1 | 22 citations | h-index: 37
Originality: Incremental advance
AI Analysis

This work addresses the adaptation and deployment problem for large-scale models in AI, offering a practical solution with reduced complexity, though it appears incremental as it builds on existing PEFT concepts.

The paper tackles the challenge of adapting large language and vision language models efficiently by proposing AdaLink, a non-intrusive parameter-efficient fine-tuning technique that achieves competitive performance compared to state-of-the-art intrusive methods and full fine-tuning on various text-only and multimodal tasks.

Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on a wide range of tasks by scaling up parameter counts from O(10^9) to O(10^12) levels and beyond. These large scales make it impossible to adapt and deploy fully specialized models given a task of interest. Parameter-efficient fine-tuning (PEFT) emerges as a promising direction to tackle the adaptation and serving challenges for such large models. We categorize PEFT techniques into two types: intrusive and non-intrusive. Intrusive PEFT techniques directly change a model's internal architecture. Though more flexible, they introduce significant complexities for training and serving. Non-intrusive PEFT techniques leave the internal architecture unchanged and only adapt model-external parameters, such as embeddings for input. In this work, we describe AdaLink as a non-intrusive PEFT technique that achieves competitive performance compared to SoTA intrusive PEFT (LoRA) and full model fine-tuning (FT) on various tasks. We evaluate using both text-only and multimodal tasks, with experiments that account for both parameter-count scaling and training regime (with and without instruction tuning).
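To make the intrusive versus non-intrusive distinction concrete, the following is a minimal sketch of an input-side adapter, written in plain Python so it runs without dependencies. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify AdaLink's internals, so the low-rank residual form, the zero initialisation, and the names `InputAdapter`, `frozen_model`, and `adapted_forward` are all hypothetical. The point it shows is only that the trainable parameters sit outside the model, which is called unchanged.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class InputAdapter:
    """Hypothetical low-rank residual adapter: E' = E + (E @ down) @ up.

    Assumption for illustration only; not taken from the paper.
    """

    def __init__(self, dim, rank, seed=0):
        rng = random.Random(seed)
        self.dim, self.rank = dim, rank
        # Trainable parameters: the only weights that would be updated.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(rank)] for _ in range(dim)]
        # Zero-initialised so the adapter starts out as the identity map.
        self.up = [[0.0] * dim for _ in range(rank)]

    def __call__(self, embeddings):
        delta = matmul(matmul(embeddings, self.down), self.up)
        return [[e + d for e, d in zip(e_row, d_row)]
                for e_row, d_row in zip(embeddings, delta)]

    def num_params(self):
        return 2 * self.dim * self.rank


def frozen_model(embeddings):
    """Stand-in for a frozen LLM/VLM, treated as a black box over embeddings."""
    return [sum(row) for row in embeddings]


def adapted_forward(adapter, embeddings):
    # Non-intrusive: the adapter touches only the input; the model is untouched.
    return frozen_model(adapter(embeddings))


adapter = InputAdapter(dim=4, rank=2)
tokens = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
outputs = adapted_forward(adapter, tokens)
```

In this sketch the adapter adds `2 * dim * rank` trainable values regardless of how large the frozen model is, and serving a new task would only require swapping the adapter. An intrusive method such as LoRA would instead place its low-rank updates inside the model's layers.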

