LG | CL | May 21, 2024

Modeling Real-Time Interactive Conversations as Timed Diarized Transcripts

Garrett Tanzer, Gustaf Ahdritz, Luke Melas-Kyriazi

arXiv:2405.13203v14.61 citations | h-index: 6 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the problem of giving chatbots real-time interactivity. The contribution appears incremental, since it builds on existing language models and adds a new decoding approach.

The paper tackles the limitation of chatbots to synchronous, turn-by-turn dialogues by introducing a method to simulate real-time interactive conversations using pretrained text-only language models. Its two case studies require generation at about 30 tok/s for instant messenger dialogues and about 20 tok/s for spoken conversations to maintain real-time interactivity.

Chatbots built upon language models have exploded in popularity, but they have largely been limited to synchronous, turn-by-turn dialogues. In this paper we present a simple yet general method to simulate real-time interactive conversations using pretrained text-only language models, by modeling timed diarized transcripts and decoding them with causal rejection sampling. We demonstrate the promise of this method with two case studies: instant messenger dialogues and spoken conversations, which require generation at about 30 tok/s and 20 tok/s respectively to maintain real-time interactivity. These capabilities can be added into language models using relatively little data and run on commodity hardware.
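The abstract names two ingredients, timed diarized transcripts and causal rejection sampling, without giving details. The sketch below is one plausible reading, not the authors' implementation. It assumes that each transcript line carries a timestamp and a speaker label, and that a sampled bot turn is rejected and resampled whenever a user turn arrives before the sampled timestamp. The names `Turn`, `format_transcript`, `propose_turn`, and `next_bot_turn` are hypothetical, the line format is invented for illustration, and `propose_turn` is a random stub standing in for a language model.

```python
import random
from dataclasses import dataclass


@dataclass
class Turn:
    time: float    # seconds since the conversation started
    speaker: str   # diarization label, e.g. "user" or "bot"
    text: str


def format_transcript(turns):
    """Render turns as 'TIME SPEAKER: TEXT' lines (an assumed format)."""
    return "\n".join(f"{t.time:07.2f} {t.speaker}: {t.text}" for t in turns)


def propose_turn(history, rng):
    """Hypothetical stand-in for a language model.

    A real model would be conditioned on format_transcript(history);
    this stub only picks a timestamp later than the last turn.
    """
    last_time = history[-1].time if history else 0.0
    return Turn(last_time + rng.uniform(0.5, 3.0), "bot", f"reply #{len(history)}")


def next_bot_turn(history, pending_user_turns, rng):
    """Sample the next bot turn, rejecting proposals that a user turn preempts."""
    history = list(history)
    pending = sorted(pending_user_turns, key=lambda t: t.time)
    while True:
        proposal = propose_turn(history, rng)
        if pending and pending[0].time <= proposal.time:
            # The user spoke first, so the proposal was sampled from stale
            # context: discard it, record the user turn, and resample.
            history.append(pending.pop(0))
            continue
        history.append(proposal)
        return history


if __name__ == "__main__":
    rng = random.Random(0)
    start = [Turn(0.0, "user", "hi")]
    interrupt = [Turn(0.2, "user", "actually, wait")]
    print(format_transcript(next_bot_turn(start, interrupt, rng)))
```

In this toy run the first proposal is always discarded, because the stub never proposes a turn earlier than 0.5 s after the last one and the interrupting user turn arrives at 0.2 s. A live system would compare against wall-clock input rather than a prepared list of pending turns.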
