AI · CL · Oct 20, 2025

Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models

Dayan Pan, Zhaoyang Fu, Jingyuan Wang, Xiao Han, Yue Zhu, Xiangyu Zhao

arXiv:2510.17705v1 · 1 citation · h-index: 3 · Has Code · CIKM

Originality: Highly original

AI Analysis

This work addresses efficient multi-task adaptation of large language models. It offers a novel method that improves performance while mitigating forgetting, though it builds incrementally on existing parameter-efficient techniques.

The paper tackles multi-task adaptation in large language models, which often leads to catastrophic forgetting and high resource costs. It proposes a Contextual Attention Modulation (CAM) mechanism and a HyCAM framework that dynamically modulate self-attention representations to balance task-specific specialization with general knowledge retention, yielding an average performance improvement of 3.65% across heterogeneous tasks.

Large Language Models (LLMs) possess remarkable generalization capabilities but struggle with multi-task adaptation, particularly in balancing knowledge retention with task-specific specialization. Conventional fine-tuning methods suffer from catastrophic forgetting and substantial resource consumption, while existing parameter-efficient methods perform suboptimally in complex multi-task scenarios. To address this, we propose Contextual Attention Modulation (CAM), a novel mechanism that dynamically modulates the representations of self-attention modules in LLMs. CAM enhances task-specific features while preserving general knowledge, thereby facilitating more effective and efficient adaptation. For effective multi-task adaptation, CAM is integrated into our Hybrid Contextual Attention Modulation (HyCAM) framework, which combines a shared, full-parameter CAM module with multiple specialized, lightweight CAM modules, enhanced by a dynamic routing strategy for adaptive knowledge fusion. Extensive experiments on heterogeneous tasks, including question answering, code generation, and logical reasoning, demonstrate that our approach significantly outperforms existing approaches, achieving an average performance improvement of 3.65%. The implemented code and data are available to ease reproducibility at https://github.com/Applied-Machine-Learning-Lab/HyCAM.
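The hybrid design described in the abstract (one shared full-parameter module, several lightweight specialized modules, and a router that fuses them) can be illustrated with a rough sketch. This is not the authors' implementation. The function name `hycam_modulate`, the low-rank form of the specialized modules, the softmax router, and the multiplicative `1 + tanh(...)` modulation are all assumptions made for illustration. The abstract does not specify any of these details, so refer to the linked repository for the actual method.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix used only to make the demo runnable."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def hycam_modulate(x, h, shared_w, experts, router_w):
    """Hypothetical hybrid modulation of one token's attention output.

    x        : token input representation (length d)
    h        : self-attention output for that token (length d)
    shared_w : full d x d matrix (assumed form of the shared module)
    experts  : list of (down, up) low-rank pairs, r x d and d x r
               (assumed form of the lightweight specialized modules)
    router_w : k x d matrix producing one routing logit per expert
    """
    shared = matvec(shared_w, x)
    gates = softmax(matvec(router_w, x))  # assumed routing: softmax gating
    fused = list(shared)
    for g, (down, up) in zip(gates, experts):
        delta = matvec(up, matvec(down, x))
        fused = [f + g * d for f, d in zip(fused, delta)]
    # Assumed modulation: scale h elementwise; zero weights leave h unchanged.
    out = [hi * (1.0 + math.tanh(f)) for hi, f in zip(h, fused)]
    return out, gates

rng = random.Random(0)
d, r, k = 8, 2, 3
x = [rng.uniform(-1, 1) for _ in range(d)]
h = [rng.uniform(-1, 1) for _ in range(d)]
shared_w = rand_matrix(d, d, rng)
experts = [(rand_matrix(r, d, rng), rand_matrix(d, r, rng)) for _ in range(k)]
router_w = rand_matrix(k, d, rng)
out, gates = hycam_modulate(x, h, shared_w, experts, router_w)
```

In this sketch, all-zero modulation weights reduce the layer to the identity on `h`, which is one simple way a modulation scheme could start from the pretrained model's behavior. Whether HyCAM initializes this way is not stated in the abstract.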

