CL · Oct 4, 2021

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Prajjwal Bhargava, Aleksandr Drozd, Anna Rogers

arXiv:2110.01518

Originality: Incremental advance

AI Analysis

This paper addresses overfitting to spurious patterns in NLI models, but the advance is incremental because it builds on existing debiasing and architectural methods.

The study investigates generalization in Natural Language Inference (NLI) by training BERT-based models on MNLI and evaluating them on the adversarially constructed HANS dataset. It identifies 2 successful and 3 unsuccessful strategies for moving beyond dataset-specific heuristics.

Much of recent progress in NLU was shown to be due to models' learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) in a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report 2 successful and 3 unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.
