HC · Jul 31, 2020

Evaluating Semantic Interaction on Word Embeddings via Simulation

Yali Bian, Michelle Dowling, Chris North

arXiv:2007.15824v19.64 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of replicable and objective evaluation for SI systems in text analysis, though it is incremental: it offers a complementary evaluation method rather than a new paradigm.

The paper tackles the problem of evaluating semantic interaction (SI) systems by proposing a quantitative, simulation-based approach for comparing word embeddings with bag-of-words features. It finds that word embeddings capture user intents better, yielding higher accuracy in simulated tests.

Semantic interaction (SI) attempts to learn the user's cognitive intents as they directly manipulate data projections during sensemaking activity. For text analysis, prior implementations of SI have used common data features, such as bag-of-words representations, for machine learning from user interactions. Instead, we hypothesize that features derived from deep learning word embeddings will enable SI to better capture the user's subtle intents. However, evaluating these effects is difficult. SI systems are usually evaluated by a human-centred qualitative approach, by observing the utility and effectiveness of the application for end-users. This approach has drawbacks in terms of replicability, scalability, and objectiveness, which makes it hard to perform convincing contrast experiments between different SI models. To tackle this problem, we explore a quantitative algorithm-centered analysis as a complementary evaluation approach, by simulating users' interactions and calculating the accuracy of the learned model. We use these methods to compare word-embeddings to bag-of-words features for SI.
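The simulation idea in the abstract (simulate a user's interactions, then score the learned model) can be illustrated with a small sketch. This is not the authors' pipeline: the function names `learn_weights` and `held_out_accuracy`, the toy data, and the Fisher-style per-dimension weighting are all assumptions chosen for illustration, standing in for whatever metric learning an SI system actually performs. The "simulated interaction" here is simply a set of documents that a hypothetical user has grouped according to ground-truth labels, and accuracy is measured by nearest-neighbour classification of the remaining documents under the learned weighted distance.

```python
import numpy as np

def learn_weights(X_moved, y_moved, eps=1e-9):
    """Stand-in for SI model learning (an assumption, not the paper's method):
    weight each feature by between-group variance / within-group variance
    of the documents the simulated user has grouped."""
    classes = np.unique(y_moved)
    means = np.array([X_moved[y_moved == c].mean(axis=0) for c in classes])
    within = np.mean([X_moved[y_moved == c].var(axis=0) for c in classes], axis=0)
    score = means.var(axis=0) / (within + eps)
    total = score.sum()
    if total == 0:  # no informative feature: fall back to uniform weights
        return np.full(X_moved.shape[1], 1.0 / X_moved.shape[1])
    return score / total

def held_out_accuracy(X, y, moved_idx, weights=None):
    """1-NN accuracy on documents the simulated user did NOT move,
    using a weighted squared Euclidean distance to the moved documents."""
    moved = np.asarray(moved_idx)
    rest = np.setdiff1d(np.arange(len(X)), moved)
    if weights is None:
        weights = learn_weights(X[moved], y[moved])
    diff = X[rest][:, None, :] - X[moved][None, :, :]
    d2 = (weights * diff ** 2).sum(axis=2)
    pred = y[moved][d2.argmin(axis=1)]
    return float((pred == y[rest]).mean())

# Toy data (hypothetical): feature 0 carries the topic, feature 1 is distracting.
X = np.array([[0.0, 0], [0.1, 5], [0.2, 15], [0.3, 10],
              [1.0, 5], [1.1, 0], [1.2, 10], [1.3, 15]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
moved = [0, 3, 4, 7]  # documents the simulated user grouped by label

acc_learned = held_out_accuracy(X, y, moved)
acc_uniform = held_out_accuracy(X, y, moved, weights=np.array([0.5, 0.5]))
print(acc_learned, acc_uniform)
```

To compare two feature representations in the spirit of the abstract, one would call `held_out_accuracy` once per feature matrix (for example, a bag-of-words matrix and an embedding matrix for the same documents) with the same simulated interactions and compare the resulting accuracies.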
