Score Before You Speak: Improving Persona Consistency in Dialogue Generation using Response Quality Scores
This work addresses the challenge of persona fidelity in conversational AI, offering a method to improve consistency in dialogue generation for applications such as chatbots and virtual assistants. The contribution is incremental, since it builds on existing models.
The paper tackles the problem of maintaining persona consistency in dialogue generation by proposing the SBS (Score-Before-Speaking) framework, which trains models to correlate augmented responses with quality scores. The authors report improved performance on the PERSONA-CHAT and ConvAI2 benchmark datasets.
Persona-based dialogue generation is an important milestone towards building conversational artificial intelligence. Despite the ever-improving capabilities of large language models (LLMs), effectively integrating persona fidelity in conversations remains challenging due to the limited diversity in existing dialogue data. We propose a novel framework, SBS (Score-Before-Speaking), which outperforms previous methods and yields improvements for both million- and billion-parameter models. Unlike previous methods, SBS unifies the learning of responses and their relative quality into a single step. The key innovation is to train a dialogue model to correlate augmented responses with a quality score during training and then leverage this knowledge at inference. We use noun-based substitution for augmentation and semantic similarity-based scores as a proxy for response quality. Through extensive experiments with benchmark datasets (PERSONA-CHAT and ConvAI2), we show that score-conditioned training allows existing models to better capture a spectrum of persona-consistent dialogues. Our ablation studies also demonstrate that including scores in the input prompt during training is superior to conventional training setups. Code and further details are available at https://arpita2512.github.io/score_before_you_speak
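The score-conditioned setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how nouns are found or replaced, which similarity model is used, or how the prompt is laid out. Here `substitute_nouns` takes a hand-written replacement map (a real pipeline would presumably use a part-of-speech tagger), `bow_cosine` is a simple word-count cosine standing in for a semantic similarity score, and the `score: ... | persona: ...` prompt layout is hypothetical.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Word-count cosine similarity; a stand-in (assumption) for a semantic score."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def substitute_nouns(response: str, replacements: dict[str, str]) -> str:
    """Swap the listed nouns; the replacement map is supplied by hand (assumption)."""
    return " ".join(replacements.get(tok, tok) for tok in response.split())


def build_prompt(score: float, persona: str, context: str, response: str = "") -> str:
    """Place the score in the input prompt (hypothetical layout)."""
    return f"score: {score:.2f} | persona: {persona} | context: {context} | response: {response}"


persona = "i have a dog"
context = "what do you do on weekends ?"
gold = "i love hiking with my dog"

# Training: the gold response and its augmented variant are each paired
# with a score measuring similarity to the gold response.
augmented = substitute_nouns(gold, {"dog": "cat"})
gold_score = bow_cosine(gold, gold)
aug_score = bow_cosine(augmented, gold)
train_examples = [
    build_prompt(gold_score, persona, context, gold),
    build_prompt(aug_score, persona, context, augmented),
]

# Inference: condition on the highest score and let the model fill in the response.
inference_prompt = build_prompt(1.0, persona, context)

for example in train_examples:
    print(example)
print(inference_prompt)
```

In this toy case the augmented response, which contradicts the persona, receives a lower score than the gold response, so a model trained on both examples sees the score and the response quality vary together. At inference the prompt asks for the top of that range.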