CLApr 26, 2025

Detect, Explain, Escalate: Low-Carbon Dialogue Breakdown Management for LLM-Powered Agents

Abdellah Ghassel, Xianzhi Li, Xiaodan Zhu

arXiv:2504.18839v2h-index: 1IEEE Transactions on Audio, Speech, and Language Processing

Originality Incremental advance

AI Analysis

It addresses the critical challenge of user trust in conversational AI for high-impact domains, offering an incremental but efficient solution.

This paper tackles the problem of conversational breakdowns in LLM-powered agents by introducing a 'Detect, Explain, Escalate' framework, which improves accuracy by 7% over baselines and reduces inference costs by 54%.

While Large Language Models (LLMs) are transforming numerous applications, their susceptibility to conversational breakdowns remains a critical challenge undermining user trust. This paper introduces a "Detect, Explain, Escalate" framework to manage dialogue breakdowns in LLM-powered agents, emphasizing low-carbon operation. Our approach integrates two key strategies: (1) We fine-tune a compact 8B-parameter model, augmented with teacher-generated reasoning traces, which serves as an efficient real-time breakdown 'detector' and 'explainer'. This model demonstrates robust classification and calibration on English and Japanese dialogues, and generalizes well to the BETOLD dataset, improving accuracy by 7% over its baseline. (2) We systematically evaluate frontier LLMs using advanced prompting (few-shot, chain-of-thought, analogical reasoning) for high-fidelity breakdown assessment. These are integrated into an 'escalation' architecture where our efficient detector defers to larger models only when necessary, substantially reducing operational costs and energy consumption. Our fine-tuned model and prompting strategies establish new state-of-the-art results on dialogue breakdown detection benchmarks, outperforming specialized classifiers and significantly narrowing the performance gap to larger proprietary models. The proposed monitor-escalate pipeline reduces inference costs by 54%, offering a scalable, efficient, and more interpretable solution for robust conversational AI in high-impact domains. Code and models will be publicly released.

View on arXiv PDF

Similar