A Better Variant of Self-Critical Sequence Training
arXiv:2003.09971v2, 43 citations
AI Analysis
This is an incremental improvement to REINFORCE-based training for sequence generation models.
The paper improves Self-Critical Sequence Training by making one simple change: it replaces the baseline function used in the REINFORCE algorithm. The new baseline performs better than the greedy decoding baseline and adds no extra cost.
In this work, we present a simple yet better variant of Self-Critical Sequence Training. We make a simple change in the choice of baseline function in the REINFORCE algorithm. Compared to the greedy decoding baseline, the new baseline brings better performance at no extra cost.
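
The text above does not say what the new baseline is, so the following is only a minimal sketch of how a baseline enters a REINFORCE-style loss. It contrasts the greedy decoding baseline named above with one possible alternative, assumed here for illustration: the mean reward of the other sequences sampled for the same input (a leave-one-out baseline). The function names, the plain-list representation of rewards and log-probabilities, and the leave-one-out choice itself are assumptions, not details taken from the text.

```python
from typing import List


def greedy_baseline_advantages(sample_rewards: List[float],
                               greedy_reward: float) -> List[float]:
    """Advantage of each sampled sequence over the greedy decode's reward.

    Needs one extra greedy decode per input to obtain greedy_reward.
    """
    return [r - greedy_reward for r in sample_rewards]


def leave_one_out_advantages(sample_rewards: List[float]) -> List[float]:
    """Advantage of each sample over the mean reward of the OTHER samples.

    Assumed alternative baseline (hypothetical for this sketch): it reuses
    the rewards already computed for the samples, so no extra decode is run.
    """
    k = len(sample_rewards)
    if k < 2:
        raise ValueError("need at least two samples per input")
    total = sum(sample_rewards)
    return [r - (total - r) / (k - 1) for r in sample_rewards]


def reinforce_loss(log_probs: List[float], advantages: List[float]) -> float:
    """REINFORCE surrogate loss: negative mean of advantage * log-probability.

    log_probs[i] is the summed token log-probability of sample i.
    """
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have equal length")
    weighted = sum(a * lp for a, lp in zip(advantages, log_probs))
    return -weighted / len(log_probs)


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]        # e.g. a sentence-level score per sample
    log_probs = [-1.0, -2.0, -3.0]   # summed log-probabilities per sample
    print(greedy_baseline_advantages(rewards, greedy_reward=2.0))
    print(leave_one_out_advantages(rewards))
    print(reinforce_loss(log_probs, leave_one_out_advantages(rewards)))
```

In both variants the baseline for a given sample does not depend on that sample itself, so subtracting it leaves the gradient estimate unbiased. In an actual training loop the log-probabilities would be differentiable tensors rather than floats.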