CL, AI, LG | Jan 14, 2025

A Multi-Encoder Frozen-Decoder Approach for Fine-Tuning Large Language Models

arXiv:2501.07818v1 | 2 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reducing deployment overhead and improving portability for fine-tuning in NLP, though it appears incremental as it builds on existing freezing strategies.

The paper investigates a multi-encoder frozen-decoder approach to fine-tuning large language models efficiently. It finds the approach effective for tasks with natural language outputs and for mitigating catastrophic forgetting in multilingual setups. On structured and QA tasks, performance is maintained or improved when the frozen decoder is paired with a larger model.

Among parameter-efficient fine-tuning methods, freezing has emerged as a popular strategy for speeding up training, reducing catastrophic forgetting, and improving downstream performance. We investigate the impact of freezing the decoder in a multi-task setup comprising diverse natural language tasks, aiming to reduce deployment overhead and enhance portability to novel tasks. Our experiments, conducted by fine-tuning both individual and multi-task setups on the AlexaTM model, reveal that freezing decoders is highly effective for tasks with natural language outputs and mitigates catastrophic forgetting in multilingual tasks. However, we find that pairing frozen decoders with a larger model can effectively maintain or even enhance performance in structured and QA tasks, making it a viable strategy for a broader range of task types.
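The deployment-overhead argument can be made concrete with a little arithmetic. The sketch below assumes, as a reading of the title rather than a detail confirmed by the abstract, that each task gets its own fine-tuned encoder while a single frozen decoder is shared across tasks. The class `Seq2SeqSize`, both helper functions, and the parameter counts are hypothetical and chosen only for illustration; they are not AlexaTM figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seq2SeqSize:
    """Parameter counts for an encoder-decoder model (hypothetical numbers)."""
    encoder_params: int
    decoder_params: int

    @property
    def total(self) -> int:
        return self.encoder_params + self.decoder_params


def separate_models_footprint(size: Seq2SeqSize, num_tasks: int) -> int:
    """Parameters to deploy when every task has its own fully fine-tuned copy."""
    return num_tasks * size.total


def shared_decoder_footprint(size: Seq2SeqSize, num_tasks: int) -> int:
    """Parameters to deploy with one encoder per task and one shared frozen decoder.

    Assumption: the decoder weights are identical for all tasks because they
    were never updated, so they only need to be stored once.
    """
    return num_tasks * size.encoder_params + size.decoder_params


if __name__ == "__main__":
    # Illustrative sizes only; not taken from the paper.
    size = Seq2SeqSize(encoder_params=250_000_000, decoder_params=250_000_000)
    for n in (1, 4, 16):
        full = separate_models_footprint(size, n)
        shared = shared_decoder_footprint(size, n)
        print(f"{n} tasks: separate={full:,} shared={shared:,} ratio={shared / full:.2f}")
```

Under these assumptions the saving grows with the number of tasks and with the decoder's share of the total parameters; with a single task the two footprints are equal.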

