CL · AI · LG · AS · Dec 5, 2022

A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS

Tsinghua
arXiv:2212.03817v1 · 6 citations · h-index: 13
Originality: Incremental advance
AI Analysis

The paper addresses user experience issues in large-scale commercial dialogue systems by reducing errors caused by misunderstandings. The contribution is incremental, since it builds on existing satisfaction prediction methods.

The paper tackles the problem of predicting user satisfaction in a commercial spoken dialogue system (DuerOS) so that the system can interact proactively. When deployed on DuerOS, the model yields a 19% relative improvement in satisfaction prediction accuracy and a 2.3% relative improvement in user experience.

Recently, spoken dialogue systems have been widely deployed in a variety of applications, serving a huge number of end-users. A common issue is that errors resulting from noisy utterances, semantic misunderstandings, or lack of knowledge make it hard for a real system to respond properly, possibly leading to an unsatisfactory user experience. To avoid such cases, we consider a proactive interaction mechanism in which the system predicts user satisfaction with the candidate response before giving it to the user. If the prediction indicates that the user is unlikely to be satisfied, the system asks the user a suitable question to determine their real intent instead of providing the response directly. After this interaction, the system can give the user a better response. Previous models that predict user satisfaction are not applicable to DuerOS, which is a large-scale commercial dialogue system. They are based on hand-crafted features and thus can hardly learn the complex patterns behind millions of conversations or the temporal dependencies across multiple turns of a conversation. Moreover, they are trained and evaluated on benchmark datasets with adequate labels, which are expensive to obtain in a commercial dialogue system. To address these challenges, we propose a pipeline that predicts user satisfaction to help DuerOS decide whether to ask for clarification in each turn. Specifically, we propose to first generate a large number of weak labels and then train a transformer-based model to predict user satisfaction with these weak labels. Empirically, we deploy and evaluate our model on DuerOS, and observe a 19% relative improvement in the accuracy of user satisfaction prediction and a 2.3% relative improvement in user experience.
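
The per-turn decision described in the abstract (predict satisfaction with the candidate response, then either respond or ask a clarifying question) can be sketched as below. This is a minimal illustration, not the DuerOS implementation: the names `Turn`, `choose_action`, and `toy_predictor` are hypothetical, the 0.5 threshold is an assumed value, and the toy predictor is a hand-written stand-in for the transformer-based model trained on weak labels.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Turn:
    """One earlier exchange in the conversation (hypothetical structure)."""
    user_utterance: str
    system_response: str


# A predictor maps (history, current utterance, candidate response) to an
# estimated probability that the user will be satisfied.
Predictor = Callable[[Sequence[Turn], str, str], float]


def choose_action(
    history: Sequence[Turn],
    utterance: str,
    candidate_response: str,
    predict_satisfaction: Predictor,
    threshold: float = 0.5,  # assumed value; the source does not state one
) -> str:
    """Return "respond" to deliver the candidate, or "clarify" to ask first."""
    score = predict_satisfaction(history, utterance, candidate_response)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {score}")
    return "respond" if score >= threshold else "clarify"


def toy_predictor(
    history: Sequence[Turn], utterance: str, candidate_response: str
) -> float:
    """Placeholder heuristic standing in for the trained model."""
    text = candidate_response.strip().lower()
    # Treat empty or apologetic fallback responses as likely unsatisfying.
    if not text or "sorry" in text:
        return 0.2
    return 0.9


if __name__ == "__main__":
    print(choose_action([], "play some jazz", "Playing jazz.", toy_predictor))
    print(choose_action([], "play jaz", "Sorry, I did not get that.", toy_predictor))
```

Passing the predictor in as a callable keeps the gating logic independent of the model, so a real satisfaction model could replace the placeholder without changing `choose_action`.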

