CL · AI · Sep 30, 2022

Underspecification in Language Modeling Tasks: A Causality-Informed Study of Gendered Pronoun Resolution

arXiv:2210.00131v4 · 1 citation · h-index: 7 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses underspecification in language modeling tasks, which can give rise to spurious correlations. It is an incremental advance because it builds on existing causality and evaluation frameworks.

The study addresses underspecification in language modeling tasks, specifically gendered pronoun resolution, by introducing a simple causal model and two lightweight black-box evaluation methods. Applied to LLMs ranging from BERT-base to GPT-4 Turbo Preview, these methods reveal previously unreported gender vs. time and gender vs. location spurious correlations.

Modern language modeling tasks are often underspecified: for a given token prediction, many words may satisfy the user's intent of producing natural language at inference time; however, only one word will minimize the task's loss function at training time. We introduce a simple causal mechanism to describe the role underspecification plays in the generation of spurious correlations. Despite its simplicity, our causal model directly informs the development of two lightweight black-box evaluation methods that we apply to gendered pronoun resolution tasks on a wide range of LLMs to 1) aid in the detection of inference-time task underspecification by exploiting 2) previously unreported gender vs. time and gender vs. location spurious correlations on LLMs with a range of A) sizes: from BERT-base to GPT-4 Turbo Preview, B) pre-training objectives: from masked & autoregressive language modeling to a mixture of these objectives, and C) training stages: from pre-training only to reinforcement learning from human feedback (RLHF). Code and open-source demos are available at https://github.com/2dot71mily/uspec.
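
To make the idea of a black-box gender vs. time probe concrete, the following is a minimal sketch and not the authors' released code. It assumes a gender-unspecified prompt template with a masked pronoun, varies only the year, and fits a least-squares slope to the share of probability assigned to the female pronoun. The template, the function names, and the `toy_pronoun_probs` stand-in for a real model query are all hypothetical; in practice that function would be replaced by a call to the model under evaluation.

```python
from statistics import mean

FEMALE, MALE = "she", "he"


def make_prompts(years):
    # Hypothetical template: nothing in the sentence specifies the
    # referent's gender, so the pronoun prediction is underspecified.
    return {y: f"In {y}, [MASK] was a teenager." for y in years}


def toy_pronoun_probs(prompt):
    # Hypothetical stand-in for a black-box model query. It returns
    # pronoun probabilities that drift with the year in the prompt,
    # mimicking a gender vs. time spurious correlation.
    year = int(prompt.split()[1].rstrip(","))
    p_she = 0.2 + 0.3 * (year - 1900) / 100
    return {FEMALE: p_she, MALE: 1.0 - p_she}


def female_share(probs):
    # Normalize over the two gendered pronouns only.
    return probs[FEMALE] / (probs[FEMALE] + probs[MALE])


def slope(xs, ys):
    # Ordinary least-squares slope of ys against xs.
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def gender_vs_time_slope(years, probs_fn):
    # A slope near zero suggests the year does not shift the pronoun
    # prediction; a clearly nonzero slope suggests a spurious correlation.
    years = list(years)
    prompts = make_prompts(years)
    shares = [female_share(probs_fn(prompts[y])) for y in years]
    return slope(years, shares)


if __name__ == "__main__":
    print(gender_vs_time_slope(range(1900, 2001, 25), toy_pronoun_probs))
```

The same structure would apply to a gender vs. location probe by swapping the year for a place name and replacing the slope with a per-location comparison of female-pronoun shares.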
