LG CL · May 4

Attribution-Guided Masking for Robust Cross-Domain Sentiment Classification

Shubham Harkare, Arvind Yogesh Suresh Babu, Yash Kulkarni

arXiv:2605.03091

AI Analysis

For practitioners in sentiment analysis, AGM offers a training-time method to improve cross-domain robustness without target-domain labels, though gains are incremental over existing domain generalization methods.

The paper tackles the performance degradation that pre-trained Transformers suffer when transferring to out-of-domain sentiment data. It proposes Attribution-Guided Masking (AGM), which generalizes competitively with strong baselines (e.g., Δ = 0.244 on Sentiment140) while providing token-level interpretability.

While pre-trained Transformer models achieve high accuracy on in-domain sentiment classification, they frequently experience severe performance degradation when transferring to out-of-domain data. We hypothesize that this generalization gap is driven by reliance on domain-specific spurious tokens. After demonstrating that post-hoc token-level attribution drift fails to predict this gap, we propose Attribution-Guided Masking (AGM), a training-time intervention that dynamically detects and penalizes highly attributed spurious tokens during fine-tuning. AGM's core component is a gradient-based attribution masking loss ($\mathcal{L}_{mask}$), which can optionally be combined with a counterfactual contrastive loss to enforce domain-invariant representations, all without requiring target-domain labels or human annotation. Evaluated in a strict zero-shot transfer setting across four diverse domains with eight random seeds, AGM achieves competitive generalization compared to five strong baselines on the hardest transfer (Sentiment140): $\Delta$ = 0.244 versus DANN (0.264), DRO (0.248), Fish (0.247), and IRM (0.238), while uniquely providing token-level interpretability into which features drive the generalization gap. Our qualitative analysis confirms that AGM suppresses attribution on domain-specific tokens such as @mentions, hashtags, and slang, shifting reliance toward domain-invariant sentiment markers. Our ablation study further confirms that attribution-guided masking is the critical component: removing it or replacing it with random token selection consistently degrades performance on difficult transfers.
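
The abstract does not spell out the form of $\mathcal{L}_{mask}$, so the following is only a rough sketch of what a gradient-based attribution penalty could look like, not the authors' implementation. It assumes a toy linear scorer over mean-pooled token embeddings (instead of a Transformer), gradient-times-input attribution, and a top-k rule for picking the tokens to penalize. The function names, the embeddings, the weights, and the choice of k are all hypothetical.

```python
# Illustrative sketch only: a toy linear scorer over mean-pooled token
# embeddings, NOT the paper's Transformer or its exact masking loss.

def token_attributions(embeddings, weights):
    """Gradient-times-input attribution per token for logit = w . mean(e_i).

    The gradient of the logit with respect to each token embedding is w / n,
    so the attribution of token i is |(w / n) . e_i|.
    """
    n = len(embeddings)
    return [abs(sum(w * x for w, x in zip(weights, e))) / n for e in embeddings]


def top_k_indices(scores, k):
    """Indices of the k highest-scoring tokens (assumed selection rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def masking_penalty(embeddings, weights, k):
    """Hypothetical penalty: share of total attribution held by the top-k tokens."""
    scores = token_attributions(embeddings, weights)
    total = sum(scores)
    if total == 0.0:
        return 0.0, []
    selected = top_k_indices(scores, k)
    return sum(scores[i] for i in selected) / total, selected


# Made-up example: two tweet-specific tokens and two ordinary words.
tokens = ["@user", "great", "movie", "#lol"]
embeddings = [[2.0, 0.0], [0.5, 1.0], [0.1, 0.1], [1.5, 0.0]]
weights = [1.0, 0.5]

penalty, selected = masking_penalty(embeddings, weights, k=2)
print([tokens[i] for i in selected], round(penalty, 3))
```

In this toy setup the penalty is large when a few tokens dominate the attribution, so adding it to the task loss would push the model to spread its reliance more evenly. How AGM actually distinguishes spurious tokens from genuine sentiment markers is not described in the abstract and is not modeled here.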
