LG · Sep 10, 2021

Counterfactual Adversarial Learning with Representation Interpolation

arXiv:2109.04746v1661 citations
AI Analysis

This paper addresses model robustness and generalization by mitigating spurious correlations, though it appears incremental, as it builds on existing causal and adversarial methods.

The paper tackles deep learning models' reliance on spurious correlations in biased training data, especially in small-data scenarios. It introduces the Counterfactual Adversarial Training (CAT) framework and reports substantial performance improvements over state-of-the-art methods on sentence classification, natural language inference, and question answering.

Deep learning models exhibit a preference for statistical fitting over logical reasoning. Spurious correlations might be memorized when there is statistical bias in the training data, which severely limits model performance, especially in small-data scenarios. In this work, we introduce the Counterfactual Adversarial Training (CAT) framework to tackle the problem from a causality perspective. Specifically, for a given sample, CAT first generates a counterfactual representation through latent space interpolation in an adversarial manner, and then performs Counterfactual Risk Minimization (CRM) on each original-counterfactual pair to adjust the sample-wise loss weights dynamically, which encourages the model to explore the true causal effect. Extensive experiments demonstrate that CAT achieves substantial performance improvements over state-of-the-art (SOTA) methods across different downstream tasks, including sentence classification, natural language inference, and question answering.
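The two steps described in the abstract (adversarial latent interpolation, then pair-based loss reweighting) can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the paper's implementation: the classifier is a plain logistic model on a small vector, the interpolation ratio `lam` is tuned by a few steps of gradient ascent on the loss for the original label, and `pair_weight` is a hypothetical weighting rule standing in for the paper's Counterfactual Risk Minimization, whose exact form is not given in this summary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, h):
    """P(class 1) under a toy logistic classifier on representation h."""
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)

def nll(p, y):
    """Negative log-likelihood of label y (0 or 1) given P(class 1) = p."""
    eps = 1e-12
    return -math.log(max(p if y == 1 else 1.0 - p, eps))

def interpolate(h, h_other, lam):
    """Latent interpolation: (1 - lam) * h + lam * h_other."""
    return [(1.0 - lam) * a + lam * c for a, c in zip(h, h_other)]

def adversarial_lambda(w, b, h, h_other, y, steps=20, lr=0.1):
    """Assumed adversarial step: gradient ascent on lam in [0, 1]
    to increase the loss of the interpolated point on the original label y."""
    lam = 0.0
    # dz/dlam = w . (h_other - h) is constant for a linear classifier
    dz_dlam = sum(wi * (c - a) for wi, a, c in zip(w, h, h_other))
    for _ in range(steps):
        p = predict(w, b, interpolate(h, h_other, lam))
        grad = (p - y) * dz_dlam          # dL/dz = p - y for logistic NLL
        lam = min(1.0, max(0.0, lam + lr * grad))
    return lam

def pair_weight(w, b, h, h_cf):
    """Hypothetical weighting rule (not from the paper): up-weight samples
    whose prediction barely moves between original and counterfactual."""
    effect = abs(predict(w, b, h) - predict(w, b, h_cf))  # in [0, 1]
    return 1.0 + (1.0 - effect)                           # in [1, 2]

# Toy usage: h has label 1, h_other comes from a sample with label 0.
w, b = [2.0, 0.0], 0.0
h, h_other, y = [1.0, 0.5], [-1.0, 0.5], 1
lam = adversarial_lambda(w, b, h, h_other, y)
h_cf = interpolate(h, h_other, lam)
weighted_loss = pair_weight(w, b, h, h_cf) * nll(predict(w, b, h), y)
```

In a real model, `h` would be a hidden representation from an encoder and the gradient with respect to `lam` would come from automatic differentiation rather than the closed form used here.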

Code Implementations: 1 repo