CL AI | Jan 9, 2025

Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models

Qingyu Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Zeye Sun, Fei Yu

arXiv:2501.04945v4 | 14.79 citations | h-index: 22 | Has Code | ACL

Originality: Incremental advance

AI Analysis

This work makes a specific, incremental improvement to LLM instruction following on tasks with soft constraints, which matters for applications that need nuanced control over model outputs.

The paper tackles the problem of improving large language models' ability to follow soft constraints in instructions that carry multiple constraints. It develops a pipeline for automatic dataset construction and a curriculum learning training method ordered by constraint quantity, and it reports improved performance in experimental evaluation.

It is crucial for large language models (LLMs) to follow instructions that involve multiple constraints. However, it is an unexplored area to enhance LLMs' ability to follow soft constraints. To bridge the gap, we initially design a pipeline to construct datasets with high-quality outputs automatically. Additionally, to fully utilize the positive and negative samples generated during the data construction process, we choose Direct Preference Optimization (DPO) as the training method. Furthermore, taking into account the difficulty of soft constraints indicated by the number of constraints, we design a curriculum learning training paradigm based on the constraint quantity. We experimentally evaluate the effectiveness of our methods in improving LLMs' soft constraint following ability and analyze the factors driving the improvements. The datasets and code are publicly available at https://github.com/Rainier-rq/FollowSoftConstraint.
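The two training ideas named in the abstract, DPO on positive/negative sample pairs and a curriculum ordered by constraint quantity, can be illustrated with a minimal sketch. This is not the authors' implementation: the `PreferencePair` fields, the one-stage-per-constraint-count grouping, and the default `beta=0.1` are assumptions made for illustration only. The loss itself is the standard per-example DPO objective computed from sequence log-probabilities.

```python
import math
from dataclasses import dataclass
from itertools import groupby


@dataclass
class PreferencePair:
    # Hypothetical record layout; the paper's actual data format may differ.
    instruction: str
    chosen: str           # positive sample (follows the constraints)
    rejected: str         # negative sample (violates some constraint)
    num_constraints: int  # used here as the difficulty signal


def curriculum_stages(pairs):
    """Group pairs into stages ordered from fewest to most constraints.

    Assumption: one stage per distinct constraint count, easiest first.
    """
    ordered = sorted(pairs, key=lambda p: p.num_constraints)
    return [
        list(group)
        for _, group in groupby(ordered, key=lambda p: p.num_constraints)
    ]


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(margin)).

    margin = beta * [(policy - reference log-prob of chosen)
                     - (policy - reference log-prob of rejected)]
    """
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # Numerically stable form of log(1 + exp(-margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

In a training loop under these assumptions, one would iterate over `curriculum_stages(pairs)` in order and minimize `dpo_loss` averaged over each stage's pairs before moving to the next, harder stage.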
