CL CV HC LG
Sep 22, 2020

ALICE: Active Learning with Contrastive Natural Language Explanations

arXiv:2009.10259v131.41024 citations

Originality: Incremental advance

AI Analysis

This work addresses the high cost of data annotation for machine learning practitioners by using expert explanations to improve data efficiency. It is an incremental advance, building on existing active learning and explanation methods.

The paper tackles the problem of costly data annotation for supervised neural networks by proposing ALICE, a framework that uses active learning to elicit contrastive natural language explanations from experts and thereby improves data efficiency. The results show that models trained with ALICE outperform baselines trained with 40-100% more data, and that adding one explanation yields a performance gain equivalent to 13-30 additional labeled data points.

Training a supervised neural network classifier typically requires many annotated training samples. Collecting and annotating a large number of data points is costly and sometimes even infeasible. The traditional annotation process uses a low-bandwidth human-machine communication interface: classification labels, each of which provides only several bits of information. We propose Active Learning with Contrastive Explanations (ALICE), an expert-in-the-loop training framework that uses contrastive natural language explanations to improve data efficiency in learning. ALICE first uses active learning to select the most informative pairs of label classes, for which it elicits contrastive natural language explanations from experts. It then extracts knowledge from these explanations using a semantic parser. Finally, it incorporates the extracted knowledge by dynamically changing the learning model's structure. We applied ALICE to two visual recognition tasks: bird species classification and social relationship classification. We found that by incorporating contrastive explanations, our models outperform baseline models trained with 40-100% more training data. We also found that adding 1 explanation leads to a performance gain similar to that of adding 13-30 labeled training data points.
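
The first step described in the abstract, choosing which pairs of label classes to ask an expert about, can be illustrated with a small sketch. The abstract does not say how ALICE scores a pair as "informative", so the code below substitutes a simple stand-in heuristic: rank class pairs by how often a current model confuses them on held-out data. The function name `rank_confusable_pairs`, the class names, and the confusion counts are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def rank_confusable_pairs(confusion, class_names, top_k=2):
    """Rank unordered class pairs by mutual confusion (an assumed heuristic).

    confusion[i][j] is the number of held-out samples whose true class is i
    and whose predicted class is j. This is NOT the paper's selection rule;
    it is a stand-in used only to illustrate pair-level querying.
    """
    scored = []
    for i, j in combinations(range(len(class_names)), 2):
        # Count errors in both directions so the pair is treated symmetrically.
        mutual = confusion[i][j] + confusion[j][i]
        scored.append((mutual, class_names[i], class_names[j]))
    # Most-confused pairs first; ties are broken by name for determinism.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(a, b, m) for m, a, b in scored[:top_k]]


# Hypothetical four-class bird example with made-up confusion counts.
class_names = ["sparrow", "finch", "gull", "tern"]
confusion = [
    [40, 8, 1, 1],
    [6, 42, 1, 1],
    [0, 1, 38, 11],
    [1, 0, 9, 40],
]

queries = rank_confusable_pairs(confusion, class_names, top_k=2)
for a, b, m in queries:
    # Each selected pair becomes one contrastive question for the expert.
    print(f"Ask expert: how does a {a} differ from a {b}? (mutual confusion = {m})")
```

Querying per class pair rather than per sample is what lets a single expert answer carry information about a whole decision boundary, which is consistent with the abstract's claim that one explanation is worth many individual labels.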

