CL · Apr 16, 2021

Robust Open-Vocabulary Translation from Visual Text Representations

arXiv:2104.08211v3 · 671 citations
Originality: Incremental advance
AI Analysis

This paper addresses robustness issues in machine translation on noisy text input, though the contribution appears incremental because it adapts existing methods to a new input representation.

The paper tackles the vulnerability of machine translation models to noise, which stems from discrete vocabularies and subword segmentation. It proposes visual text representations, which replace a finite set of text embeddings with a continuous vocabulary built from rendered text. The resulting models approach or match the performance of traditional text models and are substantially more robust, e.g., 25.9 BLEU vs. 1.9 BLEU for subword models on a character-permuted German-English task.

Machine translation models have discrete vocabularies and commonly use subword segmentation techniques to achieve an 'open vocabulary.' This approach relies on consistent and correct underlying Unicode sequences, and makes models susceptible to degradation from common types of noise and variation. Motivated by the robustness of human language processing, we propose the use of visual text representations, which dispense with a finite set of text embeddings in favor of continuous vocabularies created by processing visually rendered text with sliding windows. We show that models using visual text representations approach or match the performance of traditional text models on small and larger datasets. More importantly, models with visual embeddings demonstrate significant robustness to varied types of noise, achieving, e.g., 25.9 BLEU on a character-permuted German-English task where subword models degrade to 1.9.
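
The sliding-window idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_render` is a hypothetical stand-in for a real font renderer, and the window width, stride, and zero-padding choices are assumptions made only for illustration. The sketch shows how a rendered sentence image might be cut into fixed-width, overlapping slices, each of which a model could then map to a continuous embedding instead of looking up a subword ID.

```python
from typing import List

Image = List[List[int]]  # rows of pixel intensities (height x width)


def toy_render(text: str, height: int = 4, char_width: int = 3) -> Image:
    """Hypothetical stand-in for a font renderer (an assumption, not the
    paper's method): each character fills a char_width-wide block of
    pixels whose values are derived from its code point."""
    image = [[0] * (len(text) * char_width) for _ in range(height)]
    for i, ch in enumerate(text):
        for r in range(height):
            for c in range(char_width):
                image[r][i * char_width + c] = (ord(ch) + r * char_width + c) % 256
    return image


def sliding_windows(image: Image, window: int, stride: int) -> List[Image]:
    """Cut an image into fixed-width slices taken left to right.

    Slices overlap when stride < window. The image is zero-padded on the
    right (an assumed convention) so that every slice has full width.
    """
    if not image or not image[0]:
        return []
    width = len(image[0])
    # Number of slices needed to cover every column (ceiling division).
    n = 1 if width <= window else 1 + -(-(width - window) // stride)
    padded_width = (n - 1) * stride + window
    padded = [row + [0] * (padded_width - width) for row in image]
    return [
        [row[k * stride : k * stride + window] for row in padded]
        for k in range(n)
    ]


if __name__ == "__main__":
    img = toy_render("cat")
    slices = sliding_windows(img, window=4, stride=2)
    print(len(slices), "slices of width", len(slices[0][0]))
```

Because the slices are taken from pixels rather than from a fixed token inventory, a misspelled or permuted word still produces slices that largely resemble those of the clean word, which is the intuition behind the robustness result reported above.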

Code Implementations: 1 repo