CL · May 16, 2018

What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training

arXiv:1805.06088v11113 citations
Originality: Incremental advance
AI Analysis

This paper addresses domain robustness in NLP for applications that draw on diverse data sources. It is an incremental advance over prior domain-adversarial methods.

The paper tackles the problem of learning robust text representations that generalize across heterogeneous domains by proposing a method combining structured neural components with adversarial training, achieving substantial improvements in multi-domain language identification and sentiment analysis over existing domain adaptation techniques.

Most real-world language problems require learning from heterogeneous corpora, raising the problem of learning robust models which generalise well to instances both similar (in-domain) and dissimilar (out-of-domain) to those seen in training. This requires learning an underlying task, while not learning irrelevant signals and biases specific to individual domains. We propose a novel method to optimise both in- and out-of-domain accuracy based on joint learning of a structured neural model with domain-specific and domain-general components, coupled with adversarial training for domain. Evaluating on multi-domain language identification and multi-domain sentiment analysis, we show substantial improvements over standard domain adaptation techniques and domain-adversarial training.
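
The combination described in the abstract, a domain-general (shared) component trained adversarially against a domain discriminator alongside domain-specific (private) components, can be illustrated with a toy sketch. This is not the authors' implementation: the scalar features, the parameter names (`w_shared`, `w_private`, `a_shared`, `a_private`, `d`), the binary task, the two-domain setup, and the use of gradient reversal with weight `lam` are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gradients(params, x, y_task, domain, lam):
    """Per-example gradients for a toy shared/private model.

    `domain` is 0 or 1; it is both the discriminator's target label and
    the index of the private weights used for this example (assumption).
    """
    h_shared = dot(params["w_shared"], x)
    h_private = dot(params["w_private"][domain], x)

    # The task classifier sees the shared and the private feature.
    p_task = sigmoid(params["a_shared"] * h_shared
                     + params["a_private"] * h_private)
    # The domain discriminator sees only the shared feature.
    p_dom = sigmoid(params["d"] * h_shared)

    g_task = p_task - y_task  # d(cross-entropy) / d(task logit)
    g_dom = p_dom - domain    # d(cross-entropy) / d(domain logit)

    # Gradient reversal: the shared feature follows the task gradient but
    # moves against the discriminator's gradient, scaled by lam.
    g_h_shared = g_task * params["a_shared"] - lam * g_dom * params["d"]

    return {
        "w_shared": [g_h_shared * xi for xi in x],
        "w_private": [g_task * params["a_private"] * xi for xi in x],
        "a_shared": g_task * h_shared,
        "a_private": g_task * h_private,
        "d": g_dom * h_shared,  # the discriminator itself minimises its loss
    }

def sgd_step(params, x, y_task, domain, lam=0.1, lr=0.1):
    """Apply one SGD update in place and return the parameters."""
    g = gradients(params, x, y_task, domain, lam)
    params["w_shared"] = [w - lr * gw
                          for w, gw in zip(params["w_shared"], g["w_shared"])]
    # Only the private weights of the example's own domain are updated.
    params["w_private"][domain] = [
        w - lr * gw for w, gw in zip(params["w_private"][domain], g["w_private"])
    ]
    for name in ("a_shared", "a_private", "d"):
        params[name] -= lr * g[name]
    return params
```

With `lam=0` the shared weights are trained on the task alone; increasing `lam` pushes the shared feature toward being uninformative about the domain, while the private weights remain free to capture domain-specific signal.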

Code Implementations: 1 repo