CL · CV · LG · Oct 20, 2020

Towards End-to-End In-Image Neural Machine Translation

arXiv:2010.10648v1996 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for automated translation of text embedded in images, but its contribution is incremental because it builds on existing neural machine translation approaches.

The paper tackles the problem of translating text within images from one language to another using an end-to-end neural model, demonstrating promising initial results with pixel-level supervision.

In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language. We propose an end-to-end neural model for this task inspired by recent approaches to neural machine translation, and demonstrate promising initial results based purely on pixel-level supervision. We then offer a quantitative and qualitative evaluation of our system outputs and discuss some common failure modes. Finally, we conclude with directions for future work.
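To make "pixel-level supervision" concrete, the sketch below scores a predicted output image directly against the target-language image, pixel by pixel, with no intermediate text labels. The abstract does not state which loss the authors use, so the mean squared error here is an assumption chosen for illustration; `Image` and `pixel_mse` are hypothetical names, not taken from the paper.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of intensities in [0, 1].
Image = List[List[float]]


def pixel_mse(predicted: Image, target: Image) -> float:
    """Mean squared error averaged over every pixel position.

    Assumption: MSE stands in for whatever pixel-level loss the paper uses.
    """
    # Both images must have identical height and row widths.
    if len(predicted) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(predicted, target)
    ):
        raise ValueError("predicted and target images must have the same shape")

    total = 0.0
    count = 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


# Usage: the supervision signal is the rendered target-language image itself.
target_image = [[0.0, 1.0], [1.0, 0.0]]
model_output = [[0.0, 0.5], [1.0, 0.0]]
loss = pixel_mse(model_output, target_image)
```

In an end-to-end setup of this kind, a loss like this would be the only training signal: the model never sees the source or target text as characters, only as pixels.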

