LG · AI · CL · Jul 22, 2024

Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Kaiwen Wang, Rahul Kidambi, Ryan Sullivan, Alekh Agarwal, Christoph Dann, Andrea Michi, Marco Gelmi, Yunxuan Li, Raghav Gupta, Avinava Dubey, Alexandre Ramé, Johan Ferret

arXiv:2407.15762v2 · 31.240 citations · h-index: 38

Originality: Incremental advance

AI Analysis

This work addresses efficient multi-objective alignment of language models, a problem relevant to AI researchers and practitioners. The advance is incremental, since it builds on existing techniques such as multi-task training and parameter-efficient finetuning.

The paper tackles the challenge of building steerable language models that can flexibly trade off multiple conflicting objectives, such as creativity and safety. It introduces the Conditional Language Policy (CLP) framework, which outperforms and Pareto-dominates existing approaches on two summarization datasets without requiring a separate model for each trade-off.

Reward-based finetuning is crucial for aligning language policies with intended behaviors (e.g., creativity and safety). A key challenge is to develop steerable language models that trade off multiple (conflicting) objectives in a flexible and efficient manner. This paper presents Conditional Language Policy (CLP), a general framework for finetuning language models on multiple objectives. Building on techniques from multi-task training and parameter-efficient finetuning, CLP learns steerable models that effectively trade off conflicting objectives at inference time. Notably, this does not require training or maintaining multiple models to achieve different trade-offs between the objectives. Through extensive experiments and ablations on two summarization datasets, we show that CLP learns steerable language models that outperform and Pareto-dominate existing approaches for multi-objective finetuning.
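The abstract relies on two ideas that a short sketch may make concrete: trading off several reward signals through a weight vector, and comparing methods by Pareto dominance. The Python below is only an illustration under stated assumptions. The function names are hypothetical, and the use of linear scalarization with weights sampled uniformly from the simplex is an assumption for illustration, not a detail confirmed by this page.

```python
import random


def sample_weights(num_objectives, rng):
    """Sample a weight vector uniformly from the probability simplex.

    Assumption: uniform sampling is an illustrative choice only.
    Normalizing independent Exp(1) draws yields a Dirichlet(1, ..., 1) sample.
    """
    draws = [rng.expovariate(1.0) for _ in range(num_objectives)]
    total = sum(draws)
    return [d / total for d in draws]


def scalarize(rewards, weights):
    """Combine per-objective rewards into one scalar by a weighted sum."""
    return sum(w * r for w, r in zip(weights, rewards))


def pareto_dominates(a, b):
    """Return True if reward vector `a` Pareto-dominates `b`.

    `a` must be at least as good on every objective and strictly
    better on at least one (higher is better).
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


if __name__ == "__main__":
    rng = random.Random(0)
    weights = sample_weights(2, rng)
    # Hypothetical rewards for one summary on two objectives.
    print(weights, scalarize([0.8, 0.3], weights))
    print(pareto_dominates((0.8, 0.6), (0.7, 0.6)))
```

In this framing, a steerable model would receive the weight vector as an extra input at inference time, so that changing the weights changes the trade-off without switching models. A method Pareto-dominates another when its achievable reward vectors are never worse on any objective and strictly better on at least one.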
