Calibration of Encoder Decoder Models for Neural Machine Translation
This work addresses reliability and inference issues in neural machine translation. It is incremental in that it builds on existing encoder-decoder models.
The paper tackles miscalibration in state-of-the-art neural machine translation models, showing that they are poorly calibrated even when conditioned on the true previous tokens, and proposes recalibration methods that improve accuracy, sequence-level calibration, and beam-search results.
We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs such as those in NMT, calibration is important not only for reliable prediction confidence but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation points to two main causes: severe miscalibration of the EOS (end-of-sequence) marker and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam search.
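To make the notion of token-level calibration concrete, the following is a minimal sketch of expected calibration error (ECE), one common way to quantify the gap between a model's confidence and its accuracy. The abstract does not say which metric is used, so the choice of ECE, the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are all assumptions made for illustration.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE over token predictions (hypothetical helper).

    confidences: probability the model assigned to its predicted token.
    correct:     1 if the predicted token matched the reference, else 0.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    if confidences.shape != correct.shape:
        raise ValueError("confidences and correct must have the same shape")
    if confidences.size == 0:
        raise ValueError("at least one prediction is required")

    # Equal-width bins over [0, 1]; the interior edges decide bin membership.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        # Gap between observed accuracy and mean confidence in this bin,
        # weighted by the fraction of predictions that fall in the bin.
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap
    return float(ece)


if __name__ == "__main__":
    # Four tokens predicted with 0.9 confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

In the usage example, the model is 90% confident but only 75% accurate, so the ECE is about 0.15. A perfectly calibrated model would score 0. Conditioning on the true previous tokens, as in the abstract, corresponds to collecting `confidences` and `correct` under teacher forcing rather than from free-running decoding.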