CL · Oct 4, 2021

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

arXiv:2110.01518v1 · 671 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of NLI models overfitting to spurious dataset patterns, but the contribution is incremental because it builds on existing debiasing and architectural methods.

The study investigated generalization in Natural Language Inference (NLI) by testing BERT-based models on the adversarial HANS dataset after training on MNLI, identifying 2 successful and 3 unsuccessful strategies to move beyond dataset-specific heuristics.

Much of recent progress in NLU was shown to be due to models' learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) in a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report 2 successful and 3 unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.

Code Implementations: 1 repo
