CL · Aug 16, 2018

SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

arXiv:1808.05326v1 · 1511 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for robust commonsense-reasoning benchmarks in AI, though as a dataset-creation effort its contribution is incremental.

The authors tackle grounded commonsense inference by introducing SWAG, a large-scale dataset of 113k multiple-choice questions, and use Adversarial Filtering to reduce annotation artifacts and human biases. Competitive models struggle on the resulting task, while humans reach 88% accuracy.

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). In this paper, we introduce the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning. We present SWAG, a new dataset with 113k multiple choice questions about a rich spectrum of grounded situations. To address the recurring challenges of the annotation artifacts and human biases found in many existing datasets, we propose Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data. To account for the aggressive adversarial filtering, we use state-of-the-art language models to massively oversample a diverse set of potential counterfactuals. Empirical results demonstrate that while humans can solve the resulting inference problems with high accuracy (88%), various competitive models struggle on our task. We provide comprehensive analysis that indicates significant opportunities for future research.
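The Adversarial Filtering loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each ending is reduced to a single numeric "style feature", and it stands in a nearest-centroid scorer for the paper's ensemble of stylistic classifiers. The names `train_scorer` and `adversarial_filter`, and the `real` / `chosen` / `pool` fields, are hypothetical and exist only for illustration.

```python
import random


def train_scorer(real, fake):
    """Fit a toy 'stylistic' model on one numeric feature per ending.

    Assumption: a nearest-centroid rule replaces the paper's classifier
    ensemble. A higher score means the value looks more like a real ending.
    """
    mu_real = sum(real) / len(real)
    mu_fake = sum(fake) / len(fake)
    return lambda x: abs(x - mu_fake) - abs(x - mu_real)


def adversarial_filter(items, rounds=10, seed=0):
    """Iteratively swap easy-to-spot distractors for harder ones.

    Each item is a dict with:
      'real'   - feature of the true ending
      'chosen' - features of the distractors currently in the dataset
      'pool'   - features of oversampled candidates not currently in use
    """
    rng = random.Random(seed)
    for _ in range(rounds):
        # Fresh random split each round: train on one half, filter the other.
        idx = list(range(len(items)))
        rng.shuffle(idx)
        half = len(idx) // 2
        train, test = idx[:half], idx[half:]

        real = [items[i]["real"] for i in train]
        fake = [x for i in train for x in items[i]["chosen"]]
        score = train_scorer(real, fake)

        for i in test:
            item = items[i]
            pool = item["pool"]
            for j, current in enumerate(item["chosen"]):
                if not pool:
                    break
                # Candidate the current model finds most "real-looking".
                k = max(range(len(pool)), key=lambda k: score(pool[k]))
                if score(pool[k]) > score(current):
                    # Swap: the easy distractor goes back into the pool.
                    item["chosen"][j], pool[k] = pool[k], current
    return items
```

Calling `adversarial_filter` on a list of such dicts mutates and returns it; the number of distractors and pool candidates per item never changes, only which candidates are in use. Retraining on a fresh split each round is what keeps the filter from overfitting to a single model's blind spots, which is the reason the abstract gives for oversampling many candidate counterfactuals in the first place.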
