Tied Multitask Learning for Neural Speech Translation
This work addresses low-resource speech translation. Its contribution is incremental: it builds on existing multitask models and adds specific enhancements.
The authors tackle low-resource speech transcription and translation with a multitask model in which the second task's decoder receives information from the first task's decoder, combined with regularization that encourages transitivity and invertibility. This improves performance on both tasks and gives better attention-based word discovery from unsegmented input.
We explore multitask models for neural translation of speech, augmenting them in order to reflect two intuitive notions. First, we introduce a model where the second task decoder receives information from the decoder of the first task, since higher-level intermediate representations should provide useful information. Second, we apply regularization that encourages transitivity and invertibility. We show that the application of these notions on jointly trained models improves performance on the tasks of low-resource speech transcription and translation. It also leads to better performance when using attention information for word discovery over unsegmented input.
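The two notions in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the dot-product attention, the toy dimensions, and the squared-error form of both penalties are assumptions chosen only to make the ideas concrete. The second decoder attends to both the speech encoder states and the first decoder's states, and the penalties compare attention matrices that should agree if attention composes transitively or inverts cleanly.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, keys):
    # Dot-product attention (an assumed, simplified form).
    # Returns row-stochastic weights (n_queries, n_keys) and context vectors.
    weights = softmax(queries @ keys.T, axis=-1)
    return weights, weights @ keys

def transitivity_penalty(a12, a23, a13):
    # Hypothetical regularizer: the direct attention from task-2 outputs to
    # the source (a13) should match the attention composed through the
    # task-1 decoder (a23 @ a12).
    return float(np.sum((a13 - a23 @ a12) ** 2))

def invertibility_penalty(a_fwd, a_bwd):
    # Hypothetical regularizer: attending forward and then backward should
    # approximately return to the starting position (identity matrix).
    prod = a_fwd @ a_bwd
    return float(np.sum((prod - np.eye(prod.shape[0])) ** 2))

rng = np.random.default_rng(0)
dim = 8
speech = rng.normal(size=(20, dim))      # encoder states for 20 speech frames
transcript = rng.normal(size=(6, dim))   # first-decoder states (6 tokens)
translation = rng.normal(size=(5, dim))  # second-decoder states (5 tokens)

a12, _ = attend(transcript, speech)              # transcript -> speech
a23, ctx_dec1 = attend(translation, transcript)  # translation -> transcript
a13, ctx_enc = attend(translation, speech)       # translation -> speech

# The second decoder sees context from the encoder and from the first decoder.
second_decoder_input = np.concatenate([ctx_enc, ctx_dec1], axis=-1)
penalty = transitivity_penalty(a12, a23, a13)
```

In a trained model such penalties would be added to the task losses with some weight; here the states are random, so the value of `penalty` only shows that the shapes line up.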