CL · Sep 16, 2021

Translation Transformers Rediscover Inherent Data Domains

arXiv:2109.07864v1651 citations
Originality: Incremental advance
AI Analysis

The paper addresses domain adaptation in machine translation by showing that NMT models already encode text domains internally, which allows a more efficient adaptation approach that needs no external models.

The study analyzes how Neural Machine Translation (NMT) Transformers represent text domain information when trained without domain labels. It finds that their sentence representations can be clustered by domain without supervision, and that at the document level the cluster-to-domain correspondence nears 100%. The authors apply this to domain adaptation by reusing the NMT model itself as the source of clusters, which in their experiments performs equal to or significantly better than clustering with pre-trained language models.

Many works have proposed methods to improve the performance of Neural Machine Translation (NMT) models in a domain/multi-domain adaptation scenario. However, an understanding of how NMT baselines represent text domain information internally is still lacking. Here we analyze the sentence representations learned by NMT Transformers and show that these explicitly include the information on text domains, even after only seeing the input sentences without domain labels. Furthermore, we show that this internal information is enough to cluster sentences by their underlying domains without supervision. We show that NMT models produce clusters better aligned to the actual domains compared to pre-trained language models (LMs). Notably, when computed at the document level, NMT cluster-to-domain correspondence nears 100%. We use these findings together with an approach to NMT domain adaptation using automatically extracted domains. Whereas previous work relied on external LMs for text clustering, we propose re-using the NMT model as a source of unsupervised clusters. We perform an extensive experimental study comparing two approaches across two data scenarios, three language pairs, and both sentence-level and document-level clustering, showing equal or significantly superior performance compared to LMs.
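The clustering idea in the abstract can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not say how sentence vectors are pooled, which clustering algorithm is used, or how cluster-to-domain correspondence is scored. The sketch mean-pools token vectors, runs a plain k-means with a deterministic farthest-first initialisation, and scores the result with cluster purity. The function names (`mean_pool`, `kmeans`, `purity`) and the toy vectors are hypothetical; in practice the vectors would come from an NMT encoder.

```python
from collections import Counter


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector (assumed pooling)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with farthest-first initialisation (an illustrative choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assignments = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = mean_pool(members)
    return assignments


def purity(assignments, domains):
    """Fraction of items whose cluster's majority domain matches their own domain."""
    matched = 0
    for c in set(assignments):
        labels = [d for a, d in zip(assignments, domains) if a == c]
        matched += Counter(labels).most_common(1)[0][1]
    return matched / len(domains)


# Toy stand-ins for sentence vectors; the domain labels are used only for scoring.
sentence_vectors = [
    [0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
    [5.0, 5.1], [5.1, 5.0], [5.0, 5.0],
]
true_domains = ["medical", "medical", "medical", "legal", "legal", "legal"]

clusters = kmeans(sentence_vectors, k=2)
print(purity(clusters, true_domains))
```

Under the same assumptions, a document-level variant would first call `mean_pool` on the sentence vectors of each document and then cluster the resulting document vectors.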

Code Implementations: 1 repo