CL · May 2, 2018

Hypothesis Only Baselines in Natural Language Inference

Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme

arXiv:1805.01042v135.51416 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work reveals potential flaws in NLI datasets that researchers should account for. The contribution is incremental, since it builds on existing diagnostic methods.

The paper tackles the problem of diagnosing Natural Language Inference (NLI) datasets by proposing a hypothesis-only baseline that ignores the context. In experiments on ten datasets, this baseline significantly outperforms a majority class baseline on a number of them, which indicates statistical irregularities in those datasets.

We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on ten distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority class baseline across a number of NLI datasets. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
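
To make the comparison in the abstract concrete, here is a minimal sketch of a hypothesis-only classifier evaluated against a majority class baseline. It is an illustration under stated assumptions, not the authors' model: the classifier is a small Naive Bayes over hypothesis tokens chosen for brevity, the examples are invented toy data rather than items from any of the ten datasets, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Invented toy examples (premise, hypothesis, label); not drawn from any real NLI dataset.
TRAIN = [
    ("A man plays guitar.", "Nobody is playing music.", "contradiction"),
    ("A dog runs in a park.", "The dog is not moving.", "contradiction"),
    ("Two kids eat lunch.", "Nobody is eating.", "contradiction"),
    ("A woman reads a book.", "A person is reading.", "entailment"),
    ("A boy kicks a ball.", "A person is outdoors.", "entailment"),
    ("A chef cooks pasta.", "Someone is cooking.", "entailment"),
    ("A girl paints a wall.", "A person is painting.", "entailment"),
]
TEST = [
    ("A band performs.", "Nobody is performing.", "contradiction"),
    ("A man sits on a sofa.", "A person is sitting.", "entailment"),
    ("A bird flies.", "The bird is not flying.", "contradiction"),
    ("A child swims.", "A person is swimming.", "entailment"),
]


def tokenize(text):
    """Lowercase, drop periods, split on whitespace."""
    return text.lower().replace(".", "").split()


def majority_baseline(train):
    """Return the most frequent training label."""
    return Counter(label for _, _, label in train).most_common(1)[0][0]


def train_hypothesis_only(train):
    """Collect label and per-label token counts from hypotheses only."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for _premise, hypothesis, label in train:  # the premise is deliberately ignored
        label_counts[label] += 1
        for tok in tokenize(hypothesis):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict_hypothesis_only(model, hypothesis):
    """Naive Bayes prediction with add-one smoothing; unseen tokens are skipped."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(hypothesis):
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(predictions, gold):
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


gold = [label for _, _, label in TEST]
majority_label = majority_baseline(TRAIN)
majority_acc = accuracy([majority_label] * len(TEST), gold)

model = train_hypothesis_only(TRAIN)
hypothesis_preds = [predict_hypothesis_only(model, hyp) for _, hyp, _ in TEST]
hypothesis_acc = accuracy(hypothesis_preds, gold)

print("majority baseline accuracy:", majority_acc)
print("hypothesis-only accuracy:", hypothesis_acc)
```

The toy data is built so that negation words such as "nobody" and "not" appear only in contradiction hypotheses. That is the kind of statistical irregularity the abstract describes: when the hypothesis-only score clearly exceeds the majority class score, the labels are partly predictable without ever reading the context.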
