LG CL ML · Mar 3, 2019

Calibration of Encoder Decoder Models for Neural Machine Translation

arXiv:1903.00802v1 24.9111 citations

Originality: Incremental advance

AI Analysis

The paper addresses reliability and inference issues in neural machine translation. The advance is incremental because it builds on existing encoder-decoder models.

The paper tackled the problem of miscalibration in state-of-the-art neural machine translation models, showing they are poorly calibrated even with true previous tokens, and proposed recalibration methods that improved accuracy, sequence-level calibration, and beam-search results.

We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs, as in NMT, calibration is important not just for reliable confidence in predictions, but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation points to two main reasons: severe miscalibration of EOS (the end-of-sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam search.
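To make "miscalibrated" concrete, the sketch below computes the standard expected calibration error (ECE) over token-level predictions: predictions are binned by confidence, and each bin's accuracy is compared with its average confidence. This is a generic illustration and not necessarily the exact metric used in the paper. The function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions made for the example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Standard ECE with equal-width confidence bins (illustrative sketch).

    confidences: model probability assigned to each predicted token, in [0, 1].
    correct: whether each predicted token matched the reference token.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hits = [0] * n_bins        # number of correct predictions per bin
    counts = [0] * n_bins      # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Confidence 1.0 is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hits[idx] += int(ok)
        counts[idx] += 1

    ece = 0.0
    for k in range(n_bins):
        if counts[k] == 0:
            continue
        avg_conf = conf_sum[k] / counts[k]
        accuracy = hits[k] / counts[k]
        # Weight each bin's confidence/accuracy gap by its share of predictions.
        ece += (counts[k] / n) * abs(accuracy - avg_conf)
    return ece
```

A well-calibrated model yields an ECE near zero. A model that assigns 0.9 confidence to tokens that are right only 75% of the time contributes a gap of about 0.15 for that bin.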
