CL · AI · LG · Jun 11, 2024

Joint Learning of Context and Feedback Embeddings in Spoken Dialogue

arXiv:2406.07291v1 · 4 citations
Originality: Incremental advance
AI Analysis

This work addresses feedback response modeling in dialogue systems. The advance is incremental: it builds on existing timing-focused approaches by also accounting for the lexical and prosodic form of feedback responses.

The paper models short feedback responses in spoken dialogue by jointly embedding dialogue contexts and feedback responses with a contrastive learning objective. The resulting model outperforms humans at ranking feedback responses by appropriateness, and its embeddings capture the conversational function of those responses.

Short feedback responses, such as backchannels, play an important role in spoken dialogue. So far, most of the modeling of feedback responses has focused on their timing, often neglecting how their lexical and prosodic form influences their contextual appropriateness and conversational function. In this paper, we investigate the possibility of embedding short dialogue contexts and feedback responses in the same representation space using a contrastive learning objective. In our evaluation, we primarily focus on how such embeddings can be used as a context-feedback appropriateness metric and thus for feedback response ranking in U.S. English dialogues. Our results show that the model outperforms humans given the same ranking task and that the learned embeddings carry information about the conversational function of feedback responses.
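
To make the idea of a shared embedding space more concrete, the following is a minimal sketch of a generic symmetric contrastive (InfoNCE-style) objective and of ranking candidate feedback responses by cosine similarity to a context. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the encoders, the exact loss, or any hyperparameters, so the function names, the temperature value, and the use of precomputed embedding matrices are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(context_emb, feedback_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    Row i of context_emb is assumed to be paired with row i of
    feedback_emb; all other rows in the batch act as negatives.
    The temperature is an assumed value, not taken from the paper.
    """
    c = l2_normalize(context_emb)
    f = l2_normalize(feedback_emb)
    logits = c @ f.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(c))
    # Context-to-feedback direction: each row should pick its own column.
    loss_c2f = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Feedback-to-context direction: each column should pick its own row.
    loss_f2c = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_c2f + loss_f2c)


def rank_feedback(context_vec, candidate_embs):
    """Rank candidate feedback embeddings by cosine similarity to one context.

    Returns candidate indices from most to least similar, plus the scores.
    """
    scores = l2_normalize(candidate_embs) @ l2_normalize(context_vec)
    order = np.argsort(-scores)
    return order, scores


if __name__ == "__main__":
    # Random stand-ins for encoder outputs; real embeddings would come
    # from trained context and feedback encoders.
    rng = np.random.default_rng(0)
    contexts = rng.normal(size=(8, 16))
    feedback = contexts + 0.1 * rng.normal(size=(8, 16))
    print("loss:", contrastive_loss(contexts, feedback))
    order, _ = rank_feedback(contexts[0], feedback[:4])
    print("ranking for context 0:", order)
```

In this sketch the cosine similarity between a context embedding and a feedback embedding plays the role of the context-feedback appropriateness metric described above: a higher score places a candidate response earlier in the ranking.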
