CHORUS: An Agentic Framework for Generating Realistic Deliberation Data
This work addresses the scarcity of deliberation data available to researchers analyzing online discourse, though the contribution appears incremental because it builds on existing agent and LLM methods.
The authors address the scarcity of large-scale deliberation data for online discourse analysis by proposing Chorus, an agentic framework that generates realistic discussions using LLM-powered actors with consistent personas and a Poisson process-based temporal model. Thirty experts evaluated the framework, and the results confirmed it as a practical tool for generating high-quality data.
Understanding the intricate dynamics of online discourse depends on large-scale deliberation data, a resource that remains scarce across interactive web platforms because of restrictive accessibility policies, ethical concerns and inconsistent data quality. In this paper, we propose Chorus, an agentic framework that orchestrates LLM-powered actors with behaviorally consistent personas to generate realistic deliberation discussions. Each actor is controlled by an autonomous agent equipped with memory of the evolving discussion, while participation timing is governed by a principled Poisson process-based temporal model that approximates the heterogeneous engagement patterns of real users. The framework is further supported by structured tool usage, which enables actors to access external resources and facilitates integration with interactive web platforms. The framework was deployed on the Deliberate platform and evaluated by 30 expert participants across three dimensions: content realism, discussion coherence and analytical utility. The evaluation confirmed Chorus as a practical tool for generating high-quality deliberation data suitable for online discourse analysis.
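To make the idea of Poisson process-based participation timing concrete, the following is a minimal sketch and not the paper's actual temporal model, whose details the abstract does not give. It assumes each actor posts according to an independent homogeneous Poisson process with its own rate, so the gaps between an actor's posts are exponentially distributed. The function name `simulate_participation`, the actor names and the rate values are all hypothetical, chosen only for illustration.

```python
import random


def simulate_participation(rates, horizon, seed=0):
    """Return a time-sorted list of (time, actor) posting events.

    Assumption: each actor follows an independent homogeneous Poisson
    process, so inter-post gaps are exponential with mean 1 / rate.

    rates:   dict mapping actor name -> expected posts per unit time
    horizon: length of the simulated discussion window
    """
    rng = random.Random(seed)  # seeded for reproducible schedules
    events = []
    for actor, rate in rates.items():
        if rate <= 0:
            continue  # an actor with a non-positive rate never posts
        t = rng.expovariate(rate)  # time of the actor's first post
        while t < horizon:
            events.append((t, actor))
            t += rng.expovariate(rate)  # add the next exponential gap
    events.sort()  # merge all actors into one discussion timeline
    return events


# Hypothetical personas with different engagement levels (posts per hour).
rates = {"frequent_poster": 2.0, "occasional_poster": 0.5, "lurker": 0.05}
schedule = simulate_participation(rates, horizon=24.0)
for time, actor in schedule[:5]:
    print(f"{time:6.2f} h  {actor}")
```

Giving each actor a different rate is one simple way to obtain heterogeneous engagement: over the same window, a high-rate actor is expected to post many times while a low-rate actor may not post at all. A time-varying rate would be a natural extension, but whether Chorus uses one is not stated in the abstract.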