CL

Sep 5, 2022

A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension

Xanh Ho, Johannes Mario Meissner, Saku Sugawara, Akiko Aizawa

arXiv:2209.01824v21.47 citations

h-index: 32

Originality: Synthesis-oriented

AI Analysis

This is an incremental survey of shortcut issues in machine reading comprehension (MRC), a key task for NLP researchers and developers aiming to improve model robustness.

This survey tackles shortcut learning in machine reading comprehension (MRC). It summarizes techniques for measuring and mitigating the unintended correlations that let models solve tasks without advanced language understanding, and it highlights two concerns: the lack of public challenge sets and the absence of certain mitigation techniques that are prominent in other areas.

The issue of shortcut learning is widely known in NLP and has been an important research focus in recent years. Unintended correlations in the data enable models to easily solve tasks that were meant to exhibit advanced language understanding and reasoning capabilities. In this survey paper, we focus on the field of machine reading comprehension (MRC), an important task for showcasing high-level language understanding that also suffers from a range of shortcuts. We summarize the available techniques for measuring and mitigating shortcuts and conclude with suggestions for further progress in shortcut research. Importantly, we highlight two concerns for shortcut mitigation in MRC: (1) the lack of public challenge sets, a necessary component for effective and reusable evaluation, and (2) the lack of certain mitigation techniques that are prominent in other areas.
