CL · Sep 5, 2022

A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension

arXiv:2209.01824v27 citations · h-index: 32
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey that addresses shortcut issues in MRC, a key task for NLP researchers and developers aiming to improve model robustness.

This survey paper tackles shortcut learning in machine reading comprehension (MRC). It summarizes techniques for measuring and mitigating the unintended correlations that let models solve tasks without advanced language understanding, and it highlights two concerns: the lack of public challenge sets and the absence of certain mitigation techniques that are prominent in other areas.

The issue of shortcut learning is widely known in NLP and has been an important research focus in recent years. Unintended correlations in the data enable models to easily solve tasks that were meant to exhibit advanced language understanding and reasoning capabilities. In this survey paper, we focus on the field of machine reading comprehension (MRC), an important task for showcasing high-level language understanding that also suffers from a range of shortcuts. We summarize the available techniques for measuring and mitigating shortcuts and conclude with suggestions for further progress in shortcut research. Importantly, we highlight two concerns for shortcut mitigation in MRC: (1) the lack of public challenge sets, a necessary component for effective and reusable evaluation, and (2) the lack of certain mitigation techniques that are prominent in other areas.
