CL · AI · May 4, 2023

CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing

arXiv:2305.02865v1
Originality: Incremental advance
AI Analysis

The paper addresses generalization issues in NLU models, which matters to both researchers and practitioners, but the advance is incremental because it builds on existing causal inference and feature disentanglement methods.

The paper tackles dataset bias in NLU models by proposing CausalAPM, a framework that disentangles literal and semantic features to improve generalization. It reports significant OOD performance gains on the MNLI, FEVER, and QQP benchmarks while maintaining ID performance.

Dataset bias, i.e., the over-reliance on dataset-specific literal heuristics, is receiving increasing attention for its detrimental effect on the generalization ability of NLU models. Existing works focus on eliminating dataset bias by down-weighting problematic data in the training process, which mitigates bias but also discards valid feature information. In this work, we analyze the causes of dataset bias from the perspective of causal inference and propose CausalAPM, a generalizable literal disentangling framework that ameliorates the bias problem at feature granularity. The proposed approach projects literal and semantic information into independent feature subspaces and constrains the involvement of literal information in subsequent predictions. Extensive experiments on three NLP benchmarks (MNLI, FEVER, and QQP) demonstrate that the proposed framework significantly improves OOD generalization performance while maintaining ID performance.
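
To make the idea of separate literal and semantic subspaces concrete, here is a minimal NumPy sketch. It is not the CausalAPM implementation: the abstract does not specify the architecture or losses, so everything below is an assumption made for illustration. The function names (`split_features`, `cross_covariance_penalty`, `predict`), the linear projections, the random weights, and the use of a cross-covariance penalty as a weaker stand-in for independence are all hypothetical.

```python
import numpy as np


def split_features(h, w_literal, w_semantic):
    """Project encoder outputs h (batch, d) into two feature subspaces.

    Assumption: plain linear projections; the paper may use something else.
    """
    return h @ w_literal, h @ w_semantic


def cross_covariance_penalty(z_lit, z_sem):
    """Squared Frobenius norm of the cross-covariance of the two feature sets.

    It is zero when the two sets are linearly uncorrelated, which is a
    weaker condition than statistical independence (illustrative proxy only).
    """
    z_lit = z_lit - z_lit.mean(axis=0, keepdims=True)
    z_sem = z_sem - z_sem.mean(axis=0, keepdims=True)
    cov = z_lit.T @ z_sem / (z_lit.shape[0] - 1)
    return float(np.sum(cov ** 2))


def predict(z_sem, w_cls):
    """Classify from semantic features only; literal features are left out."""
    logits = z_sem @ w_cls
    return logits.argmax(axis=1)


# Toy data: random stand-ins for encoder outputs and (untrained) weights.
rng = np.random.default_rng(0)
h = rng.normal(size=(8, 16))          # 8 examples, 16-dim encoder output
w_literal = rng.normal(size=(16, 4))  # hypothetical literal projection
w_semantic = rng.normal(size=(16, 4)) # hypothetical semantic projection
w_cls = rng.normal(size=(4, 3))       # 3-way classifier, as in MNLI

z_lit, z_sem = split_features(h, w_literal, w_semantic)
penalty = cross_covariance_penalty(z_lit, z_sem)
labels = predict(z_sem, w_cls)
```

In a trained model, a penalty of this kind would be added to the task loss so that the two projections learn to carry different information, while the classifier sees only the semantic features. How CausalAPM actually enforces the separation and limits the literal features is described in the paper itself.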

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
