CL · HC · Oct 1, 2020

Predicting User Engagement Status for Online Evaluation of Intelligent Assistants

arXiv:2010.00656v2 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the online evaluation of intelligent assistants, a problem relevant to developers and researchers. Its contribution is incremental: it adapts existing user behavior-based evaluation methods to a new domain.

The paper tackles the challenge of online evaluation for intelligent assistants by proposing a framework that classifies user engagement into four categories, designing metrics to quantify engagement levels, and comparing models and features for predicting engagement status on four real-world datasets.

Evaluation of intelligent assistants in large-scale and online settings remains an open challenge. User behavior-based online evaluation metrics have demonstrated great effectiveness for monitoring large-scale web search and recommender systems. Therefore, we consider predicting user engagement status to be the first and critical step toward online evaluation of intelligent assistants. In this work, we first propose a novel framework for classifying user engagement status into four categories -- fulfillment, continuation, reformulation and abandonment. We then demonstrate how to design simple but indicative metrics based on the framework to quantify user engagement levels. We also aim to automate user engagement prediction with machine learning methods. We compare various models and features for predicting engagement status using four real-world datasets. We conduct detailed analyses of features and failure cases to discuss the performance of current models as well as the remaining challenges.
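
To make the four-category framework concrete, here is a minimal Python sketch. It assumes each user turn has already been labelled with one of the four statuses named in the abstract. The `status_rates` function and the per-status share it computes are hypothetical illustrations of a "simple but indicative" metric, not the metrics defined in the paper, which the abstract does not specify.

```python
from collections import Counter
from enum import Enum


class EngagementStatus(Enum):
    """The four engagement categories named in the abstract."""
    FULFILLMENT = "fulfillment"
    CONTINUATION = "continuation"
    REFORMULATION = "reformulation"
    ABANDONMENT = "abandonment"


def status_rates(labels):
    """Return the share of labelled turns in each status.

    Hypothetical metric for illustration only; the paper's own
    metric definitions are not given in the abstract.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No labelled turns: report a zero rate for every status.
        return {status: 0.0 for status in EngagementStatus}
    return {status: counts[status] / total for status in EngagementStatus}


# Usage: four labelled turns from an imagined interaction log.
turns = [
    EngagementStatus.FULFILLMENT,
    EngagementStatus.FULFILLMENT,
    EngagementStatus.REFORMULATION,
    EngagementStatus.ABANDONMENT,
]
rates = status_rates(turns)
print(rates[EngagementStatus.FULFILLMENT])  # share of fulfilled turns
```

In practice the labels would come from a trained classifier rather than being written by hand. The abstract describes comparing such models but does not detail them, so this sketch stops at the aggregation step.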

