CL · AI · Sep 26, 2025

Synthetic Dialogue Generation for Interactive Conversational Elicitation & Recommendation (ICER)

arXiv:2510.02331v1 · 1 citation · h-index: 54 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity for conversational recommender systems, which should make fine-tuning language models for this task easier, though its improvement to simulation methods is incremental.

The paper tackles the lack of public data for conversational recommender systems by developing a method to generate synthetic dialogues that stay consistent with a user's underlying state. The result is a large open-source dataset; human raters judged a sample of its dialogues to show considerable consistency, factuality, and naturalness.

While language models (LMs) offer great potential for conversational recommender systems (CRSs), the paucity of public CRS data makes fine-tuning LMs for CRSs challenging. In response, LMs as user simulators qua data generators can be used to train LM-based CRSs, but often lack behavioral consistency, generating utterance sequences inconsistent with those of any real user. To address this, we develop a methodology for generating natural dialogues that are consistent with a user's underlying state using behavior simulators together with LM-prompting. We illustrate our approach by generating a large, open-source CRS data set with both preference elicitation and example critiquing. Rater evaluation on some of these dialogues shows them to exhibit considerable consistency, factuality and naturalness.
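
The idea of pairing a behavior simulator with LM prompting can be illustrated with a minimal sketch. It is not the authors' implementation: the `UserState` fields, the system act format, and the prompt wording are all hypothetical, and no LM is actually called. The point is only that the user's act is fixed by a latent state first, and the LM is then asked to verbalize that act rather than invent preferences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserState:
    """Hypothetical latent state: attributes the simulated user likes or dislikes."""
    likes: frozenset
    dislikes: frozenset


def simulate_turn(state, system_act):
    """Return a structured user act that is fully determined by the latent state.

    system_act is an assumed (kind, payload) pair:
      ("elicit", attribute)        -> the system asks about one attribute
      ("recommend", attribute_set) -> the system proposes an item with these attributes
    """
    kind, payload = system_act
    if kind == "elicit":
        if payload in state.likes:
            return ("answer", payload, "positive")
        if payload in state.dislikes:
            return ("answer", payload, "negative")
        return ("answer", payload, "neutral")
    if kind == "recommend":
        # Critique the first (alphabetically) attribute that clashes with a dislike.
        clashes = sorted(payload & state.dislikes)
        if clashes:
            return ("critique", clashes[0], "negative")
        return ("accept", None, "positive")
    raise ValueError(f"unknown system act: {kind}")


def build_prompt(user_act):
    """Build an illustrative prompt asking an LM to verbalize the chosen act."""
    act, attribute, polarity = user_act
    return (
        "Write one natural user utterance for a recommender dialogue.\n"
        f"Act: {act}\nAttribute: {attribute}\nPolarity: {polarity}\n"
        "Do not state any preference beyond the ones listed."
    )


state = UserState(likes=frozenset({"comedy"}), dislikes=frozenset({"horror"}))
act = simulate_turn(state, ("recommend", frozenset({"horror", "thriller"})))
prompt = build_prompt(act)
```

Because every act is derived from the same `UserState`, the simulated user cannot contradict an earlier answer, which is the kind of behavioral consistency the abstract says LM-only simulators often lack.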

