CL · May 24, 2022

Partial-input baselines show that NLI models can ignore context, but they don't

arXiv:2205.12181v131.8630 citations · h-index: 25 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses how researchers can assess the robustness of NLI models. It shows that models can overcome dataset artifacts, though the contribution is incremental because it largely confirms existing capabilities.

The study investigated whether state-of-the-art natural language inference (NLI) models can override the default inferences made by partial-input baselines. Using a 600-example evaluation set with perturbed premises, it found that models can learn to condition on context despite being trained on artifact-ridden datasets.

When strong partial-input baselines reveal artifacts in crowdsourced NLI datasets, the performance of full-input models trained on such datasets is often dismissed as reliance on spurious correlations. We investigate whether state-of-the-art NLI models are capable of overriding default inferences made by a partial-input baseline. We introduce an evaluation set of 600 examples consisting of perturbed premises to examine a RoBERTa model's sensitivity to edited contexts. Our results indicate that NLI models are still capable of learning to condition on context--a necessary component of inferential reasoning--despite being trained on artifact-ridden datasets.
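The evaluation idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `override_rate` metric, the `Example` record, the two toy classifiers, and the five example sentences are all hypothetical. The sketch measures how often a full-input model predicts the gold label on examples where a hypothesis-only baseline's default prediction is wrong.

```python
from typing import Callable, List, NamedTuple


class Example(NamedTuple):
    premise: str      # edited (perturbed) context
    hypothesis: str
    gold: str         # gold label given the edited premise


def override_rate(
    examples: List[Example],
    full_model: Callable[[str, str], str],
    partial_baseline: Callable[[str], str],
) -> float:
    """Hypothetical metric: among examples where the hypothesis-only
    baseline's default prediction disagrees with the gold label, the
    fraction on which the full-input model predicts the gold label."""
    contested = [
        ex for ex in examples if partial_baseline(ex.hypothesis) != ex.gold
    ]
    if not contested:
        return 0.0
    overridden = sum(
        full_model(ex.premise, ex.hypothesis) == ex.gold for ex in contested
    )
    return overridden / len(contested)


# Toy stand-ins (assumptions): a real study would use trained classifiers.
def toy_baseline(hypothesis: str) -> str:
    # Sees only the hypothesis, mimicking an artifact-driven default.
    return "contradiction" if "nobody" in hypothesis.lower() else "entailment"


def toy_full_model(premise: str, hypothesis: str) -> str:
    # Crude word-overlap heuristic over both inputs.
    h_words = set(hypothesis.lower().rstrip(".").split())
    p_words = set(premise.lower().rstrip(".").split())
    return "entailment" if h_words <= p_words else "neutral"


examples = [
    Example("A man is sleeping on a bench.", "A man is sleeping.", "entailment"),
    Example("A woman reads a book.", "A man is sleeping.", "neutral"),
    Example("A man is not sleeping.", "A man is sleeping.", "contradiction"),
    Example("A dog runs in a park.", "A dog runs.", "entailment"),
    Example("A cat sits on a sofa.", "A dog runs.", "neutral"),
]

rate = override_rate(examples, toy_full_model, toy_baseline)
print(f"override rate: {rate:.2f}")
```

In this toy data the baseline defaults to "entailment" for every hypothesis, so only the three examples with a different gold label count toward the rate. The negated premise shows a case where the overlap heuristic fails to override the default.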
