A simple discriminative training method for machine translation with large-scale features
This work addresses implementation complexity for researchers and practitioners who use large-scale features in statistical machine translation, but it is incremental: it matches rather than surpasses existing methods.
The authors tackle the complexity of implementing margin infused relaxed algorithms (MIRAs) for machine translation with large-scale features by introducing a new method that treats an N-best list as a permutation and minimizes the Plackett-Luce loss. The results show that the method is more robust than MERT and matches MIRAs while being easier to implement.
Margin infused relaxed algorithms (MIRAs) dominate model tuning in statistical machine translation when large-scale features are used, but they are also notorious for their implementation complexity. We introduce a new method that regards an N-best list as a permutation and minimizes the Plackett-Luce loss of ground-truth permutations. Experiments with large-scale features demonstrate that the new method is more robust than MERT; although it only matches MIRAs, it has the comparative advantage of being easier to implement.
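
To make the Plackett-Luce loss of a permutation concrete, the following is a minimal sketch of the standard Plackett-Luce negative log-likelihood for one N-best list. It is not taken from the paper: the function name `plackett_luce_loss` and the arguments `scores` and `ranking` are hypothetical, and the paper's exact formulation (for example, any truncation or scaling) may differ.

```python
import math


def plackett_luce_loss(scores, ranking):
    """Negative log-likelihood of `ranking` under a Plackett-Luce model.

    scores:  model scores, one per hypothesis in the N-best list (assumed input)
    ranking: hypothesis indices ordered best-first, i.e. the ground-truth
             permutation (assumed to be given)
    """
    # Reorder the scores so that position 0 holds the best hypothesis.
    ordered = [scores[i] for i in ranking]
    loss = 0.0
    for j in range(len(ordered)):
        # At step j, the model picks the j-th item from those not yet placed.
        tail = ordered[j:]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(tail)
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        # -log( exp(s_j) / sum_{k >= j} exp(s_k) )
        loss += log_z - ordered[j]
    return loss
```

With three hypotheses that all receive the same score, every one of the 3! orderings is equally likely, so the loss equals log 6. The loss falls as the model scores agree more closely with the ground-truth ordering.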