HC, AI, CL, IR | Jan 29, 2024

KAUCUS: Knowledge Augmented User Simulators for Training Language Model Assistants

arXiv:2401.16454v1 | 8 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the need for scalable and diverse user simulators to improve multi-turn instruction-following assistants, though it is incremental as it builds on existing simulator methods.

The paper tackles the problem of creating diverse user simulators for training language model assistants by introducing KAUCUS, a framework that incorporates external knowledge through retrieval augmentation or summary control, resulting in more helpful assistants as shown by reward and preference model evaluations.

An effective multi-turn instruction-following assistant can be developed by creating a simulator that can generate useful interaction data. Apart from relying on its intrinsic weights, an ideal user simulator should also be able to rapidly bootstrap external knowledge in its raw form to simulate the wide diversity of text available on the internet. Previous user simulators generally lacked diversity, were mostly closed-domain, and required rigid schemas, making it inefficient to scale them rapidly to incorporate external knowledge. To address this, we introduce Kaucus, a Knowledge-Augmented User Simulator framework, which outlines a process for creating diverse user simulators that can seamlessly exploit external knowledge and benefit downstream assistant model training. Using two GPT-J-based simulators, namely a Retrieval Augmented Simulator and a Summary Controlled Simulator, we generate diverse simulator-assistant interactions. Through reward and preference model-based evaluations, we find that these interactions serve as useful training data and create more helpful downstream assistants. We also find that incorporating knowledge through retrieval augmentation or summary control helps create better assistants.

