Neural Sequence Model Training via $α$-divergence Minimization
This work addresses training challenges for neural sequence models, such as those used in machine translation, but appears incremental because it builds on existing divergence frameworks.
The authors propose a new objective function, based on $α$-divergence, for training neural sequence models. The objective generalizes the maximum-likelihood and reinforcement learning approaches, and the authors demonstrate that using $α > 0$ outperforms maximum-likelihood training on machine translation.
We propose a new neural sequence model training method in which the objective function is defined by $α$-divergence. We demonstrate that the objective function generalizes the maximum-likelihood (ML)-based and reinforcement learning (RL)-based objective functions as special cases (i.e., ML corresponds to $α \to 0$ and RL to $α \to 1$). We also show that the gradient of the objective function can be considered a mixture of ML- and RL-based objective gradients. The experimental results on a machine translation task show that minimizing the objective function with $α > 0$ outperforms $α \to 0$, which corresponds to ML-based methods.
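The two limiting cases named in the abstract can be checked numerically on a toy example. The sketch below assumes the standard Amari form of the $α$-divergence, $D_α(p\|q) = \frac{1}{α(1-α)}\left(1 - \sum_y p(y)^{α} q(y)^{1-α}\right)$, with $p$ standing in for the model distribution and $q$ for a target distribution. The paper's exact parameterization may differ, and the three-outcome distributions and function names are hypothetical, chosen only for illustration.

```python
import math


def kl(a, b):
    """KL(a || b) for two discrete distributions given as lists."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)


def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between discrete p and q, for 0 < alpha < 1.

    Assumed form (not taken from the paper):
        D_alpha(p || q) = (1 - sum_y p(y)^alpha * q(y)^(1 - alpha))
                          / (alpha * (1 - alpha))
    """
    overlap = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (1.0 - overlap) / (alpha * (1.0 - alpha))


# Hypothetical toy distributions over three candidate output sequences.
p_model = [0.7, 0.2, 0.1]   # stands in for the model distribution
q_target = [0.5, 0.3, 0.2]  # stands in for the target distribution

# alpha -> 0 approaches KL(q || p), the direction minimized by ML-style training.
near_zero = alpha_divergence(p_model, q_target, 1e-4)
# alpha -> 1 approaches KL(p || q), the direction associated with RL-style training.
near_one = alpha_divergence(p_model, q_target, 1.0 - 1e-4)

print("alpha -> 0:", near_zero, "vs KL(q || p):", kl(q_target, p_model))
print("alpha -> 1:", near_one, "vs KL(p || q):", kl(p_model, q_target))
```

Under this assumed form, intermediate values of $α$ interpolate between the two KL directions, which is one way to read the abstract's statement that the gradient mixes ML- and RL-based gradients.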