CL AI May 23, 2023

Out-of-Distribution Generalization in Text Classification: Past, Present, and Future

arXiv:2305.14104v1, 3 citations
Originality: Synthesis-oriented
AI Analysis

This is a survey paper, so its contribution is incremental: it summarizes existing work rather than proposing new methods.

This paper addresses the lack of comprehensive surveys on out-of-distribution generalization in text classification by presenting the first review of recent progress, methods, and evaluations, aiming to encourage future research in this area.

Machine learning (ML) systems in natural language processing (NLP) face significant challenges in generalizing to out-of-distribution (OOD) data, where the test distribution differs from the training data distribution. This poses important questions about the robustness of NLP models and their high accuracy, which may be artificially inflated due to their underlying sensitivity to systematic biases. Despite these challenges, there is a lack of comprehensive surveys on the generalization challenge from an OOD perspective in text classification. Therefore, this paper aims to fill this gap by presenting the first comprehensive review of recent progress, methods, and evaluations on this topic. We further discuss the challenges involved and potential future research directions. By providing quick access to existing work, we hope this survey will encourage future research in this area.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
