From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction
This work addresses credit assignment in neural sequence prediction, offering incremental improvements over two prior approaches, RAML and Actor-Critic.
The paper tackles the credit assignment problem in reward augmented maximum likelihood (RAML) learning by establishing a theoretical equivalence between the token-level counterpart of RAML and entropy regularized reinforcement learning. This yields two new algorithms, which outperform RAML and Actor-Critic respectively on two benchmark datasets.
In this work, we study the credit assignment problem in reward augmented maximum likelihood (RAML) learning, and establish a theoretical equivalence between the token-level counterpart of RAML and entropy regularized reinforcement learning. Inspired by this connection, we propose two sequence prediction algorithms, one extending RAML with fine-grained credit assignment and the other improving Actor-Critic with systematic entropy regularization. On two benchmark datasets, we show that the proposed algorithms outperform RAML and Actor-Critic respectively, providing new alternatives for sequence prediction.
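As a rough illustration of the kind of connection involved, the sketch below works at the sequence level only; it is not the token-level result or either of the proposed algorithms. It assumes a small, fully enumerated candidate set and checks numerically that the exponentiated payoff distribution used as the RAML target, q(y) proportional to exp(r(y) / tau), maximizes the entropy regularized objective E_p[r] + tau * H(p). The function names, reward values, and temperature are hypothetical choices made for this example.

```python
import math
import random


def exponentiated_payoff(rewards, tau):
    """Return q(y) proportional to exp(r(y) / tau), computed stably."""
    m = max(rewards)
    weights = [math.exp((r - m) / tau) for r in rewards]
    z = sum(weights)
    return [w / z for w in weights]


def entropy_regularized_objective(p, rewards, tau):
    """Return E_p[r] + tau * H(p) for a distribution p over candidates."""
    expected_reward = sum(pi * r for pi, r in zip(p, rewards))
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return expected_reward + tau * entropy


def soft_max_value(rewards, tau):
    """Return tau * log sum_y exp(r(y) / tau), the optimal objective value."""
    m = max(rewards)
    return m + tau * math.log(sum(math.exp((r - m) / tau) for r in rewards))


def random_distribution(n, rng):
    """Draw an arbitrary distribution over n candidates for comparison."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]


# Hypothetical rewards for four candidate output sequences (assumed values).
rewards = [1.0, 0.5, 0.0, -0.3]
tau = 0.5  # assumed temperature

q = exponentiated_payoff(rewards, tau)
best = entropy_regularized_objective(q, rewards, tau)

# No randomly drawn distribution should score higher than q.
rng = random.Random(0)
gaps = [
    best - entropy_regularized_objective(random_distribution(len(rewards), rng), rewards, tau)
    for _ in range(1000)
]
print("objective at q:", best)
print("smallest gap to a random distribution:", min(gaps))
```

The gap between the objective at q and at any other distribution p equals tau * KL(p || q), which is why it is never negative. Lowering tau concentrates q on the highest-reward candidate, while raising it pushes q toward uniform.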