CL · May 17, 2025

Enhancing Complex Instruction Following for Large Language Models with Mixture-of-Contexts Fine-tuning

Yuheng Lu, ZiMeng Bai, Caixia Yuan, Huixing Jiang, Xiaojie Wang

arXiv:2505.11922v1 · 4.91 citations · h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in improving instruction-following for LLMs, offering a novel fine-tuning approach that is incremental but targeted at a known issue.

The paper addresses the tendency of large language models to follow complex, multi-constraint instructions inconsistently. It proposes MISO, a mixture-of-contexts fine-tuning method that transforms a sequential instruction into parallel sub-contexts. The authors report improved effectiveness in complex instruction-following scenarios and potential training efficiency gains.

Large language models (LLMs) exhibit remarkable capabilities in handling natural language tasks; however, they may struggle to consistently follow complex instructions, including those that involve multiple constraints. Post-training LLMs with supervised fine-tuning (SFT) is a standard approach to improving their ability to follow instructions. For complex instruction following, existing efforts focus primarily on data-driven methods that synthesize complex instruction-output pairs for SFT. However, if insufficient attention is allocated to crucial sub-contexts, the effectiveness of SFT may be reduced. In this work, we propose transforming a sequentially structured input instruction into multiple parallel instructions containing sub-contexts. To process this multi-input format, we propose MISO (Multi-Input Single-Output), an extension to the currently dominant decoder-only transformer-based LLMs. MISO introduces a mixture-of-contexts paradigm that jointly considers the overall instruction-output alignment and the influence of individual sub-contexts to enhance SFT effectiveness. We apply MISO fine-tuning to complex instruction-following datasets and evaluate it with standard LLM inference. Empirical results demonstrate the superiority of MISO as a fine-tuning method for LLMs, both in its effectiveness in complex instruction-following scenarios and in its potential for training efficiency.
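
The abstract's first step, turning one sequential instruction into several parallel instructions that share a single output, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how sub-contexts are chosen, so this sketch treats each constraint as one sub-context, and the names `ParallelInstruction` and `to_parallel_instructions` are hypothetical. It covers only the data transformation, not the MISO model extension or its mixture-of-contexts training objective.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParallelInstruction:
    """One of several parallel inputs that share a single target output."""
    context_id: int
    text: str


def to_parallel_instructions(task: str, constraints: List[str]) -> List[ParallelInstruction]:
    """Turn a sequential instruction (task followed by constraints) into
    parallel inputs: the full instruction plus one input per constraint.

    Assumption: one constraint == one sub-context. This is a hypothetical
    decomposition, not the paper's definition.
    """
    # Input 0 keeps the whole instruction, so overall alignment is still visible.
    full = " ".join([task, *constraints])
    parallel = [ParallelInstruction(0, full)]
    # Each remaining input isolates a single constraint next to the task.
    for i, constraint in enumerate(constraints, start=1):
        parallel.append(ParallelInstruction(i, f"{task} {constraint}"))
    return parallel


example = to_parallel_instructions(
    "Write a product description for a travel mug.",
    ["Use at most 50 words.", "Do not mention the price."],
)
for item in example:
    print(item.context_id, item.text)
```

In this sketch the full instruction is kept as its own input so that the overall instruction-output pairing is still represented alongside the isolated constraints; how the actual method weighs these inputs is not specified in the abstract.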
