CL · Feb 19, 2024

SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning

arXiv:2402.11896v2 · 26 citations · h-index: 3 · ACL
AI Analysis

SIBO addresses performance degradation during fine-tuning for users of large language models, but the contribution is incremental, since it builds on existing parameter-efficient fine-tuning techniques.

The paper tackles over-smoothing in Transformer-based large language models during parameter-efficient fine-tuning, reporting improvements of up to 15.7% and 23.5% on arithmetic and commonsense reasoning tasks, respectively.

Fine-tuning all parameters of large language models (LLMs) necessitates substantial computational power and extended time. Latest advancements in parameter-efficient fine-tuning (PEFT) techniques, such as Adapter tuning and LoRA, allow for adjustments to only a minor fraction of the parameters of these LLMs. Concurrently, it has been noted that the issue of over-smoothing diminishes the effectiveness of these Transformer-based LLMs, resulting in suboptimal performances in downstream tasks. In this paper, we present SIBO, which is a SImple BOoster to enhance PEFT, by injecting an initial residual. SIBO is straightforward and readily extensible to a range of state-of-the-art PEFT techniques to alleviate over-smoothing and enhance performance. Extensive experiments on 22 benchmark datasets demonstrate that SIBO significantly enhances the performance of various strong baselines, achieving up to 15.7% and 23.5% improvement over existing PEFT methods on the arithmetic and commonsense reasoning tasks, respectively.
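
The abstract does not spell out how the initial residual is injected, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes the residual is a convex blend `(1 - lam) * h + lam * h0` of the current hidden state `h` with the initial token representation `h0`; the blend formula, the weight `lam`, and the helper names `inject_initial_residual`, `smooth`, `spread`, and `run` are all hypothetical. The `smooth` function is a stand-in for a layer that over-smooths by pulling token vectors toward their mean.

```python
# Illustrative sketch only. The blend formula, the weight `lam`, and every
# function name below are assumptions made for this example.

def inject_initial_residual(h, h0, lam):
    """Blend the current hidden state h with the initial representation h0."""
    return [(1.0 - lam) * x + lam * x0 for x, x0 in zip(h, h0)]


def smooth(tokens):
    """Toy over-smoothing layer: move every token halfway toward the mean token."""
    dim = len(tokens[0])
    mean = [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]
    return [[0.5 * t[d] + 0.5 * mean[d] for d in range(dim)] for t in tokens]


def spread(tokens):
    """Largest per-coordinate gap between tokens (0 means all tokens collapsed)."""
    dim = len(tokens[0])
    return max(
        max(t[d] for t in tokens) - min(t[d] for t in tokens) for d in range(dim)
    )


def run(tokens0, layers, lam):
    """Apply `layers` toy layers; with lam > 0, re-inject tokens0 before each one."""
    tokens = [list(t) for t in tokens0]
    for _ in range(layers):
        if lam > 0.0:
            tokens = [
                inject_initial_residual(t, t0, lam)
                for t, t0 in zip(tokens, tokens0)
            ]
        tokens = smooth(tokens)
    return tokens


tokens0 = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
collapsed = spread(run(tokens0, layers=10, lam=0.0))
preserved = spread(run(tokens0, layers=10, lam=0.2))
print(f"spread without residual: {collapsed:.4f}")
print(f"spread with residual:    {preserved:.4f}")
```

In this toy setting the token vectors collapse toward a single point when `lam` is 0, while re-injecting the initial representation keeps them apart. That is the intuition behind using an initial residual against over-smoothing, under the assumptions stated above.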

