CLLGMLOct 21, 2014

Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

arXiv:1410.5877v172 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of efficiently scaling machine translation systems for applications requiring high-quality translations, though it appears incremental as it builds on existing active learning methods.

The paper tackles the problem of diminishing returns when adding translation data to already resource-rich machine translation systems, presenting an active learning-style data solicitation algorithm that achieves an order of magnitude increase in performance improvement rates.

We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find that we get an order of magnitude increase in performance rates of improvement.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes