CL, AI · May 9, 2018

Improving End-of-turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task

arXiv:1805.06511v1 · 24 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of end-of-turn detection for automated dialogue systems, but the advance is incremental: it builds on the known influence of speaker intentions on turn-taking.

The paper tackles the problem of predicting turn-transitions in spoken dialogues by proposing a multi-task neural approach that simultaneously predicts speaker intentions as an auxiliary task, resulting in improved performance without needing extra features at run-time.

This work focuses on the use of acoustic cues for modeling turn-taking in dyadic spoken dialogues. Previous work has shown that speaker intentions (e.g., asking a question, uttering a backchannel, etc.) can influence turn-taking behavior and are good predictors of turn-transitions in spoken dialogues. However, speaker intentions are not readily available for use by automated systems at run-time, making it difficult to use this information to anticipate a turn-transition. To this end, we propose a multi-task neural approach for predicting turn-transitions and speaker intentions simultaneously. Our results show that adding the auxiliary task of speaker intention prediction improves the performance of turn-transition prediction in spoken dialogues, without relying on additional input features during run-time.
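The multi-task idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that some encoder (not shown) has already turned the acoustic features into a shared vector, that two linear heads read from that vector, and that the training objective is the turn-transition loss plus a weighted intention loss. The names `multitask_loss`, `predict_turn`, and `aux_weight`, the class counts, and the weight value are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def linear(weights, bias, features):
    """One linear layer: a row of weights and a bias per output class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def multitask_loss(shared, turn_head, intent_head,
                   turn_label, intent_label, aux_weight=0.5):
    """Joint training loss: main task plus weighted auxiliary task.

    `shared` stands in for the output of a shared encoder (assumed).
    Each head is a (weights, bias) pair. `aux_weight` is a hypothetical
    hyperparameter, not a value taken from the paper.
    """
    turn_loss = cross_entropy(linear(*turn_head, shared), turn_label)
    intent_loss = cross_entropy(linear(*intent_head, shared), intent_label)
    return turn_loss + aux_weight * intent_loss


def predict_turn(shared, turn_head):
    """Run-time prediction: only the turn-transition head is used."""
    logits = linear(*turn_head, shared)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy usage: 2 turn classes (hold/switch) and 4 intention classes (assumed).
shared_vec = [0.2, -0.4]
turn_head = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
intent_head = ([[0.5, 0.1], [0.0, 0.3], [-0.2, 0.4], [0.1, 0.1]], [0.0] * 4)
loss = multitask_loss(shared_vec, turn_head, intent_head,
                      turn_label=0, intent_label=2)
prediction = predict_turn(shared_vec, turn_head)
```

Intention labels appear only in `multitask_loss`, so they are needed only during training. `predict_turn` takes just the shared vector and the turn head, which matches the abstract's claim that no additional input features are required at run-time.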

