CL · Aug 6, 2025

Dialogue Response Prefetching Based on Semantic Similarity and Prediction Confidence of Language Model

Kiyotada Mori, Seiya Kawano, Angel Fernando Garcia Contreras, Koichiro Yoshino

arXiv:2508.04403v1 · 1 citation · h-index: 18 · INTERSPEECH

Originality: Synthesis-oriented

AI Analysis

This work addresses latency reduction for users of spoken dialogue systems, but appears incremental as it builds on existing prefetching methods with a new confidence model.

The study tackled user-perceived latency in spoken dialogue systems by predicting complete user utterances before the user's speech ends. It proposed a prediction confidence model that decides whether prefetching is feasible based on semantic similarity, and evaluated the model on the differences between predicted and actual utterances.

Prefetching of dialogue responses has been investigated as a way to reduce user-perceived latency (UPL) in spoken dialogue systems, where UPL is the time the user waits before receiving the system's response. Reducing UPL requires predicting the complete user utterance before the user finishes speaking, typically with a language model, so that a dialogue response can be prepared in advance. In this study, we proposed a prediction confidence model (PCM) that determines whether prefetching is possible by estimating the semantic similarity between the predicted complete user utterance and the actual complete user utterance. We evaluated our PCM based on the differences between the predicted and actual complete user utterances.
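
The decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the bag-of-words cosine similarity stands in for whatever semantic similarity measure the paper uses, the function names `cosine_similarity` and `should_prefetch` are hypothetical, and the threshold of 0.8 is arbitrary. At run time the actual complete utterance is not yet known, so a PCM would have to estimate the similarity; here the similarity is computed directly from both utterances, as it could be when building evaluation data.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Toy bag-of-words cosine similarity between two utterances.

    Assumption: a stand-in for a real semantic similarity measure.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def should_prefetch(estimated_similarity: float, threshold: float = 0.8) -> bool:
    """Prefetch only when the estimated similarity clears the threshold.

    Assumption: the threshold value is hypothetical, not taken from the paper.
    """
    return estimated_similarity >= threshold


# Predicted completion (made before the user stops speaking) vs. what was said.
predicted = "what is the weather in tokyo tomorrow"
actual = "what is the weather in tokyo tomorrow morning"

score = cosine_similarity(predicted, actual)
print(f"similarity={score:.3f} prefetch={should_prefetch(score)}")
```

A higher threshold would trade fewer wasted prefetches for fewer latency savings; the right operating point depends on how costly a mismatched prefetched response is for the system.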
