CL · May 12

Output Composability of QLoRA PEFT Modules for Plug-and-Play Attribute-Controlled Text Generation

arXiv:2605.1234598.2

Predicted impact: top 3% in CL · last 90 days

Originality: Incremental advance

AI Analysis

This work addresses the need for flexible, plug-and-play attribute control in text generation without retraining for every task combination, offering a practical solution for multi-task LLM deployment.

The paper investigates ways of generalizing QLoRA PEFT modules beyond single-task training. It finds that summing the outputs of separately trained modules consistently matches or outperforms the alternatives, with an average gain of 2 percentage points for sentiment control across three LLMs.

Parameter-efficient fine-tuning (PEFT) techniques offer task-specific fine-tuning at a fraction of the cost of full fine-tuning, but require separate fine-tuning for every new task or task combination. In this paper, we explore three ways of generalising beyond single-task training/inference: (i) training on combinations of multiple, related datasets; (ii) at inference, composing the weight matrices of separately trained PEFT modules; and (iii) at inference, composing the outputs of separately trained PEFT modules. We test these approaches on three different LLMs, using QLoRA as the PEFT technique, and on three sets of controlled text generation datasets covering sentiment control, topic control, and multi-attribute control. We find that summing PEFT module outputs is a particularly strong composition method, which consistently either outperforms or matches the performance of alternative approaches. This holds even when comparing against single-task specialised modules on the single-task test set, where three-module output composition achieves an average increase of 2 percentage points across all models for sentiment control.
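To make the difference between approaches (ii) and (iii) concrete, the following is a minimal sketch for a single linear layer, not the authors' implementation. It assumes each PEFT module is a LoRA-style pair of low-rank factors `A` and `B` with a scalar `scale`, and it omits the 4-bit quantisation of the base weights that QLoRA uses. The abstract does not say which weight-composition operator the paper applies, so `compose_factors` (element-wise averaging of the factors) is a hypothetical stand-in chosen only for illustration. All function names are invented for this example.

```python
def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def mean_matrices(mats):
    """Element-wise mean of equally shaped matrices."""
    n = len(mats)
    return [[sum(vals) / n for vals in zip(*rows)] for rows in zip(*mats)]


def lora_output(module, x):
    """Output of one LoRA-style module: scale * B (A x)."""
    a, b, scale = module
    return [scale * y for y in matvec(b, matvec(a, x))]


def compose_outputs(w, modules, x):
    """Approach (iii), as assumed here: base output plus the sum of module outputs."""
    h = matvec(w, x)
    for module in modules:
        h = add(h, lora_output(module, x))
    return h


def compose_factors(w, modules, x):
    """Hypothetical weight-level scheme: average the A and B factors, apply once."""
    merged = (
        mean_matrices([m[0] for m in modules]),
        mean_matrices([m[1] for m in modules]),
        modules[0][2],  # assumes all modules share one scale
    )
    return add(matvec(w, x), lora_output(merged, x))


# Toy example: 2-dimensional layer, two rank-1 modules (values are made up).
w = [[1, 0], [0, 1]]                        # frozen base weights
sentiment = ([[1, 0]], [[1], [0]], 1.0)     # (A, B, scale)
topic = ([[0, 1]], [[0], [1]], 1.0)
x = [1, 2]

print(compose_outputs(w, [sentiment, topic], x))
print(compose_factors(w, [sentiment, topic], x))
```

In this toy case the two functions return different vectors, because averaging the factors introduces cross terms between modules that summing the outputs does not. Note that within a single linear layer, summing module outputs is algebraically the same as adding each product `B A` to the base weights, so any difference from a weight-level scheme comes from how the factors themselves are combined.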
