CL · AI · CV · Jun 5, 2025

MLLM-CL: Continual Learning for Multimodal Large Language Models

arXiv:2506.05453v2 · 16 citations · h-index: 5 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses continual learning for multimodal large language models, with the aim of helping them adapt to evolving domains and abilities. The advance is incremental, since it builds on existing CL concepts.

Multimodal large language models struggle to adapt to dynamic real-world scenarios. The paper responds with MLLM-CL, a benchmark for continual learning, and with a method that combines parameter isolation and routing. The authors report that the method significantly outperforms existing methods with minimal forgetting.

Recent Multimodal Large Language Models (MLLMs) excel in vision-language understanding but face challenges in adapting to dynamic real-world scenarios that require continuous integration of new knowledge and skills. While continual learning (CL) offers a potential solution, existing benchmarks and methods suffer from critical limitations. In this paper, we introduce MLLM-CL, a novel benchmark encompassing domain and ability continual learning, where the former focuses on independently and identically distributed (IID) evaluation across evolving mainstream domains, whereas the latter evaluates on non-IID scenarios with new model abilities. Methodologically, we propose preventing catastrophic interference through parameter isolation and an MLLM-based routing mechanism. Extensive experiments demonstrate that our approach can integrate domain-specific knowledge and functional abilities with minimal forgetting, significantly outperforming existing methods. Our benchmark and code are available at https://github.com/bjzhb666/MLLM-CL.
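
The abstract names two ingredients, parameter isolation and routing, without giving implementation details. The toy sketch below only illustrates the general idea, and it is not the authors' implementation: shared weights stay frozen, each new task gets its own private parameters, and a router decides which set to apply to a query. `IsolatedModel`, `keyword_router`, the task names, and the keyword lists are all hypothetical, and the keyword matcher is a deliberately simple stand-in for the MLLM-based router described in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Vector = List[float]


@dataclass
class IsolatedModel:
    """Frozen shared weights plus one private additive delta per task (toy model)."""
    base: Vector
    adapters: Dict[str, Vector] = field(default_factory=dict)

    def add_task(self, task: str, delta: Vector) -> None:
        # New parameters live under their own key; existing ones are never mutated.
        if task in self.adapters:
            raise ValueError(f"adapter for {task!r} already exists")
        self.adapters[task] = list(delta)

    def predict(self, x: Vector, task: str) -> float:
        # Effective weights = frozen base + the selected task's delta.
        # An unknown task (e.g. "base") falls back to the frozen weights alone.
        delta = self.adapters.get(task, [0.0] * len(self.base))
        return sum((b + d) * xi for b, d, xi in zip(self.base, delta, x))


def keyword_router(query: str, keywords: Dict[str, List[str]]) -> str:
    """Hypothetical stand-in for a learned router: pick the task with most keyword hits."""
    lowered = query.lower()
    scores = {task: sum(k in lowered for k in ks) for task, ks in keywords.items()}
    best = max(scores, key=scores.get)
    # With no keyword hit at all, route to the frozen base model.
    return best if scores[best] > 0 else "base"
```

Because each task's delta is stored separately, adding a second task cannot change the first task's predictions, which is the sense in which isolation avoids interference in this sketch. Overall quality then depends on the router choosing the right parameters for each query.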

