LG | AI | Jul 30, 2024

MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning

arXiv:2407.20999v4 | 12 citations | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of preserving an LLM's general capabilities during task-specific fine-tuning, particularly when pre-training data is unavailable. The advance is incremental because it builds on existing greedy block coordinate descent methods.

The paper tackles knowledge forgetting in large language models during fine-tuning. It proposes the Momentum-Filtered Optimizer (MoFO), which achieves fine-tuning performance similar to default methods while mitigating forgetting, without requiring pre-training data.

Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks. Typically, LLMs are first pre-trained on large corpora and subsequently fine-tuned on task-specific datasets. However, during fine-tuning, LLMs may forget some knowledge acquired in the pre-training stage, leading to a decline in general capabilities. Existing approaches to mitigate forgetting often rely on access to pre-training data, which may be unavailable in many real-world scenarios, such as fine-tuning checkpoint-only open-source LLMs. To address this challenge, we propose a new fine-tuning algorithm termed Momentum-Filtered Optimizer (MoFO). MoFO is an extension of greedy block coordinate descent (BCD) methods: in each iteration, MoFO only updates the model parameters with the largest momentum magnitudes, while keeping all other parameters fixed. MoFO achieves similar fine-tuning performance to the default fine-tuning algorithm while effectively mitigating knowledge forgetting. We validate MoFO through rigorous convergence analysis and extensive experiments, demonstrating its effectiveness in mitigating forgetting without pre-training data.
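
The update rule described in the abstract (move only the parameters with the largest momentum magnitudes, keep the rest fixed) can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain heavy-ball momentum on a flat list of floats, and the function name `mofo_step` and the parameters `lr`, `beta`, and `keep_fraction` are hypothetical names chosen for illustration. The abstract does not specify the base optimizer, how parameters are grouped, or what fraction is updated.

```python
def mofo_step(params, grads, momentum, lr=0.1, beta=0.9, keep_fraction=0.25):
    """One momentum-filtered update on flat lists of floats (illustrative sketch)."""
    # Refresh the momentum buffer for every coordinate (exponential moving average).
    # Assumption: simple heavy-ball momentum stands in for the real base optimizer.
    new_momentum = [beta * m + (1.0 - beta) * g for m, g in zip(momentum, grads)]

    # Number of coordinates allowed to move this step (at least one).
    # Assumption: keep_fraction is a hypothetical knob, not a value from the paper.
    k = max(1, int(len(params) * keep_fraction))

    # Greedy selection: the k coordinates with the largest momentum magnitude.
    ranked = sorted(range(len(params)),
                    key=lambda i: abs(new_momentum[i]),
                    reverse=True)
    selected = set(ranked[:k])

    # Update only the selected coordinates; all others keep their current value.
    new_params = [p - lr * new_momentum[i] if i in selected else p
                  for i, p in enumerate(params)]
    return new_params, new_momentum, selected


if __name__ == "__main__":
    params = [1.0, -2.0, 0.5, 3.0]
    grads = [0.1, -4.0, 0.2, 1.0]
    momentum = [0.0, 0.0, 0.0, 0.0]
    params, momentum, selected = mofo_step(params, grads, momentum)
    print(sorted(selected), params)
```

In this toy run only the coordinate with the largest momentum magnitude changes, while the momentum buffer is still refreshed for every coordinate, so a parameter that is frozen in one step can be selected in a later one.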

