CL | AI | Feb 11

Conversational Behavior Modeling Foundation Model With Multi-Level Perception

arXiv:2602.11065v1 | h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the challenge of building natural conversational AI systems, but it appears incremental because it applies existing methods, such as transformers and graph-based reasoning, to a specific domain.

The paper models the implicit thought chains behind human conversation in order to improve full-duplex interactive systems. It introduces a multi-level perception framework with a Graph-of-Thoughts (GoT) and reports robust behavior detection and interpretable reasoning in experiments on synthetic and real dialogues.

Human conversation is organized by an implicit chain of thoughts that manifests as timed speech acts. Capturing this perceptual pathway is key to building natural full-duplex interactive systems. We introduce a framework that models this process as multi-level perception, and then reasons over conversational behaviors via a Graph-of-Thoughts (GoT). Our approach formalizes the intent-to-action pathway with a hierarchical labeling scheme, predicting high-level communicative intents and low-level speech acts to learn their causal and temporal dependencies. To train this system, we develop a high-quality corpus that pairs controllable, event-rich dialogue data with human-annotated labels. The GoT framework structures streaming predictions as an evolving graph, enabling a transformer to forecast the next speech act, generate concise justifications for its decisions, and dynamically refine its reasoning. Experiments on both synthetic and real duplex dialogues show that the framework delivers robust behavior detection, produces interpretable reasoning chains, and establishes a foundation for benchmarking conversational reasoning in full-duplex spoken dialogue systems.
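
To make the "evolving graph" idea in the abstract concrete, here is a minimal sketch of how streaming two-level predictions might be stored. It is an illustration only and not the paper's implementation: the class names (`Node`, `ThoughtGraph`), the edge kinds, and the label strings (`inform`, `yield_floor`, `speak`, and so on) are all hypothetical, and the transformer that would produce the predictions is left out entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    idx: int
    time: float   # timestamp of the prediction, in seconds
    level: str    # "intent" (high-level) or "act" (low-level speech act)
    label: str    # hypothetical label name, not taken from the paper


@dataclass
class ThoughtGraph:
    """Hypothetical container for streaming (intent, act) predictions."""
    nodes: List[Node] = field(default_factory=list)
    edges: List[Tuple[int, int, str]] = field(default_factory=list)  # (src, dst, kind)
    last_intent: Optional[int] = None
    last_act: Optional[int] = None

    def _add(self, time: float, level: str, label: str) -> int:
        node = Node(len(self.nodes), time, level, label)
        self.nodes.append(node)
        return node.idx

    def add_prediction(self, time: float, intent: str, act: str) -> int:
        """Extend the graph with one streaming prediction; return the act node index."""
        # Open a new intent node only when the predicted intent changes.
        if self.last_intent is None or self.nodes[self.last_intent].label != intent:
            new_intent = self._add(time, "intent", intent)
            if self.last_intent is not None:
                self.edges.append((self.last_intent, new_intent, "temporal"))
            self.last_intent = new_intent
        act_idx = self._add(time, "act", act)
        # Causal edge: the current intent gives rise to this speech act.
        self.edges.append((self.last_intent, act_idx, "causal"))
        # Temporal edge: this speech act follows the previous one.
        if self.last_act is not None:
            self.edges.append((self.last_act, act_idx, "temporal"))
        self.last_act = act_idx
        return act_idx

    def justify(self, act_idx: int) -> str:
        """Trace the causal edge back to the intent and render a one-line reason."""
        act = self.nodes[act_idx]
        intent = next(self.nodes[s] for s, d, k in self.edges
                      if d == act_idx and k == "causal")
        return f"{act.label} at t={act.time:.1f}s because intent={intent.label}"


graph = ThoughtGraph()
graph.add_prediction(0.0, "inform", "speak")
graph.add_prediction(1.2, "inform", "pause")
last = graph.add_prediction(1.5, "yield_floor", "stop_speaking")
print(graph.justify(last))
```

In this sketch the justification is a template string read off a causal edge. The abstract describes justifications generated by a transformer, so the template is only a stand-in that shows where the graph structure would supply the evidence.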

