CL Jan 12, 2022

How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets

Aarne Talman, Marianna Apidianaki, Stergios Chatzikyriakidis, Jörg Tiedemann

arXiv:2201.04467v231.7627 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses a problem facing NLU researchers: how to evaluate models' reasoning capabilities. It shows that current benchmarks may be flawed, though the contribution is incremental because it builds on existing corruption studies.

The study investigated how data corruption affects natural language understanding models by fine-tuning or testing pre-trained models on corrupted GLUE datasets. Performance remained high on most tasks, which suggests that the models rely on cues other than sentence meaning.

A central question in natural language understanding (NLU) research is whether high performance demonstrates the models' strong reasoning capabilities. We present an extensive series of controlled experiments where pre-trained language models are exposed to data that have undergone specific corruption transformations. These involve removing instances of specific word classes and often lead to non-sensical sentences. Our results show that performance remains high on most GLUE tasks when the models are fine-tuned or tested on corrupted data, suggesting that they leverage other cues for prediction even in non-sensical contexts. Our proposed data transformations can be used to assess the extent to which a specific dataset constitutes a proper testbed for evaluating models' language understanding capabilities.
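The kind of corruption described in the abstract, removing every instance of a given word class, can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `remove_word_class`, the tag labels, and the hand-tagged sentence are all assumptions made for illustration; a real pipeline would presumably obtain the tags from a part-of-speech tagger.

```python
from typing import List, Tuple

# A token paired with its word-class tag. The tags are hand-assigned here
# (an assumption); a real setup would get them from a POS tagger.
TaggedToken = Tuple[str, str]


def remove_word_class(tagged: List[TaggedToken], word_class: str) -> str:
    """Return the sentence with every token of the given word class dropped."""
    kept = [token for token, tag in tagged if tag != word_class]
    return " ".join(kept)


# Hypothetical example sentence, tagged by hand.
sentence = [
    ("The", "DET"),
    ("cat", "NOUN"),
    ("chased", "VERB"),
    ("a", "DET"),
    ("small", "ADJ"),
    ("mouse", "NOUN"),
]

no_nouns = remove_word_class(sentence, "NOUN")
no_verbs = remove_word_class(sentence, "VERB")

print(no_nouns)  # The chased a small
print(no_verbs)  # The cat a small mouse
```

Both outputs are nonsensical as sentences, which is the point: if a model's accuracy barely drops on such inputs, the dataset may not require much understanding of sentence meaning.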
