CL · Oct 10, 2025

Mind-Paced Speaking: A Dual-Brain Approach to Real-Time Reasoning in Spoken Language Models

arXiv:2510.09592v1 · 8 citations · h-index: 22
Originality: Highly original
AI Analysis

The paper addresses the challenge of enabling real-time, high-quality reasoning in spoken language models for interactive dialogue applications, though it is an incremental improvement over existing think-while-speaking methods.

The paper tackles the high latency that keeps real-time Spoken Language Models from using Chain-of-Thought (CoT) reasoning. It introduces Mind-Paced Speaking, a dual-brain framework whose reasoning performance is comparable to methods that pre-compute the full CoT before speaking, at drastically reduced latency. Under a zero-latency configuration, it reaches 92.8% accuracy on Spoken-MQA and a score of 82.5 on URO-Bench.

Real-time Spoken Language Models (SLMs) struggle to leverage Chain-of-Thought (CoT) reasoning due to the prohibitive latency of generating the entire thought process sequentially. Enabling SLMs to think while speaking, similar to humans, is attracting increasing attention. We present, for the first time, Mind-Paced Speaking (MPS), a brain-inspired framework that enables high-fidelity, real-time reasoning. Similar to how humans utilize distinct brain regions for thinking and responding, we propose a novel dual-brain approach, employing a "Formulation Brain" for high-level reasoning to pace and guide a separate "Articulation Brain" for fluent speech generation. This division of labor eliminates mode-switching, preserving the integrity of the reasoning process. Experiments show that MPS significantly outperforms existing think-while-speaking methods and achieves reasoning performance comparable to models that pre-compute the full CoT before speaking, while drastically reducing latency. Under a zero-latency configuration, the proposed method achieves an accuracy of 92.8% on the mathematical reasoning task Spoken-MQA and attains a score of 82.5 on the speech conversation task URO-Bench. Our work effectively bridges the gap between high-quality reasoning and real-time interaction.

