CL, LG · Aug 25, 2022

Shortcut Learning of Large Language Models in Natural Language Understanding

Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu

arXiv:2208.11857v2 · 13.7129 citations · h-index: 155

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of improving the reliability and robustness of LLMs on natural language understanding tasks, but its contribution is incremental: it reviews existing work rather than proposing new solutions.

The paper reviews the problem of shortcut learning in large language models (LLMs), where models rely on dataset biases and artifacts for predictions, affecting generalizability and robustness, and it surveys methods to identify, characterize, and mitigate this issue.

Large language models (LLMs) have achieved state-of-the-art performance on a series of natural language understanding tasks. However, these LLMs might rely on dataset bias and artifacts as shortcuts for prediction. This has significantly affected their generalizability and adversarial robustness. In this paper, we provide a review of recent developments that address the shortcut learning and robustness challenge of LLMs. We first introduce the concepts of shortcut learning of language models. We then introduce methods to identify shortcut learning behavior in language models, characterize the reasons for shortcut learning, as well as introduce mitigation solutions. Finally, we discuss key research challenges and potential research directions in order to advance the field of LLMs.
