CL AI May 26, 2023

TranSFormer: Slow-Fast Transformer for Machine Translation

arXiv:2305.16982v1225 citations
Originality: Incremental advance
AI Analysis

This paper makes an incremental improvement to machine translation systems by enhancing multiscale Transformers with character-level features.

The paper incorporates fine-grained character-level features into multiscale Transformer models for machine translation and reports consistent BLEU improvements of more than 1 point on several benchmarks.

Learning multiscale Transformer models has been evidenced as a viable approach to augmenting machine translation systems. Prior research has primarily focused on treating subwords as basic units in developing such systems. However, the incorporation of fine-grained character-level features into multiscale Transformer has not yet been explored. In this work, we present a Slow-Fast two-stream learning model, referred to as TranSFormer, which utilizes a "slow" branch to deal with subword sequences and a "fast" branch to deal with longer character sequences. This model is efficient since the fast branch is very lightweight by reducing the model width, and yet provides useful fine-grained features for the slow branch. Our TranSFormer shows consistent BLEU improvements (larger than 1 BLEU point) on several machine translation benchmarks.
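
The abstract attributes the model's efficiency to the reduced width of the fast branch. As a rough illustration of why width matters so much, the sketch below counts the weights in one standard Transformer encoder layer (four attention projections plus two feed-forward matrices, ignoring biases and layer norms). The function name and the widths 512/2048 and 128/512 are assumptions chosen for illustration; they are not taken from the paper, and the paper's actual branch design may differ.

```python
def encoder_layer_params(d_model: int, d_ff: int) -> int:
    """Approximate weight count of one standard Transformer encoder layer.

    Counts only the four attention projections (Q, K, V, output) and the
    two feed-forward matrices; biases and layer norms are ignored.
    """
    attention = 4 * d_model * d_model
    feed_forward = 2 * d_model * d_ff
    return attention + feed_forward


# Hypothetical widths for illustration only, not the paper's settings.
slow_params = encoder_layer_params(d_model=512, d_ff=2048)
fast_params = encoder_layer_params(d_model=128, d_ff=512)

print(slow_params, fast_params, slow_params // fast_params)
```

Because the weight count grows quadratically with width, cutting the width by a factor of 4 under these assumed settings shrinks the layer by a factor of 16, which is consistent with a narrow character-level branch adding little overhead despite its longer input sequences.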

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
