CL · Nov 20, 2017

Fast BTG-Forest-Based Hierarchical Sub-sentential Alignment

arXiv:1711.07265v1
Originality: Incremental advance
AI Analysis

This paper addresses alignment efficiency and quality in statistical machine translation, particularly for distant language pairs. The contribution appears incremental, since it builds on existing bracketing transduction grammar (BTG) and fast_align methods.

The paper tackles hierarchical sub-sentential alignment for machine translation. It proposes a BTG-forest-based method with fast unsupervised initialization that matches fast_align in run-time and achieves comparable translation performance while producing smaller phrase tables. It also outperforms fast_align on distantly related language pairs such as English-Japanese.

In this paper, we propose a novel BTG-forest-based alignment method. Starting from a fast unsupervised initialization of parameters using variational IBM models, we synchronously parse parallel sentences top-down and align them hierarchically under the BTG constraint. Our two-step method achieves the same run-time and comparable translation performance as fast_align while yielding smaller phrase tables. Final SMT results show that our method even outperforms fast_align in experiments on distantly related language pairs, e.g., English-Japanese.
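
To make the idea of top-down hierarchical alignment under a BTG constraint more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it keeps only one greedy derivation instead of a forest, and it assumes a precomputed word-association score matrix (for instance, scores that could come from IBM-model-style estimates). The function name `btg_greedy_align`, the block-mass scoring criterion, and the base case are all illustrative assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Link = Tuple[int, int]


def build_prefix(scores: Matrix) -> Matrix:
    """2-D prefix sums so that any rectangular block can be summed in O(1)."""
    rows, cols = len(scores), len(scores[0])
    prefix = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            prefix[i + 1][j + 1] = (scores[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    return prefix


def btg_greedy_align(scores: Matrix) -> List[Link]:
    """Greedy top-down BTG-style segmentation (hypothetical simplification).

    scores[i][j] is an assumed association score between source word i and
    target word j; the matrix must be non-empty and rectangular.
    Returns sorted (source_index, target_index) links.
    """
    prefix = build_prefix(scores)
    links: List[Link] = []

    def mass(i0: int, i1: int, j0: int, j1: int) -> float:
        # Total score of source rows [i0, i1) x target columns [j0, j1).
        return prefix[i1][j1] - prefix[i0][j1] - prefix[i1][j0] + prefix[i0][j0]

    def split(i0: int, i1: int, j0: int, j1: int) -> None:
        # Base case (assumption): once one side is a single word,
        # link it to every word of the other span.
        if i1 - i0 == 1 or j1 - j0 == 1:
            links.extend((i, j) for i in range(i0, i1) for j in range(j0, j1))
            return
        best = (float("-inf"), i0 + 1, j0 + 1, True)
        for i in range(i0 + 1, i1):
            for j in range(j0 + 1, j1):
                # Straight rule: keep order (top-left + bottom-right blocks).
                straight = mass(i0, i, j0, j) + mass(i, i1, j, j1)
                # Inverted rule: swap order (top-right + bottom-left blocks).
                inverted = mass(i0, i, j, j1) + mass(i, i1, j0, j)
                if straight > best[0]:
                    best = (straight, i, j, True)
                if inverted > best[0]:
                    best = (inverted, i, j, False)
        _, i, j, is_straight = best
        if is_straight:
            split(i0, i, j0, j)
            split(i, i1, j, j1)
        else:
            split(i0, i, j, j1)
            split(i, i1, j0, j)

    split(0, len(scores), 0, len(scores[0]))
    return sorted(links)


if __name__ == "__main__":
    # Toy matrix with a fully reversed word order.
    reversed_order = [[0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0]]
    print(btg_greedy_align(reversed_order))
```

Each recursive call picks one split point and one of the two BTG orientations (straight or inverted), which is why the resulting alignment always respects the BTG reordering constraint. A forest-based method would retain several competing splits per span rather than committing to the single best one as this sketch does.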

