Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak
This work addresses Slovak speech recognition by reusing a model pre-trained on Czech, an incremental improvement obtained through transfer learning.
The paper tackles the training of Slovak speech recognition models by transfer learning from a Czech pre-trained Wav2Vec 2.0 model. The best results are obtained when the weights are initialized from the Czech model at the start of pre-training, and the resulting models outperform larger multilingual models on three Slovak datasets.
In this paper, we compare several methods of training Slovak speech recognition models based on the Transformer architecture. Specifically, we explore transfer learning from an existing Czech pre-trained Wav2Vec 2.0 model into Slovak. We demonstrate the benefits of the proposed approach on three Slovak datasets. Our Slovak models achieved the best results when their weights were initialized from the Czech model at the beginning of the pre-training phase. Our results show that the knowledge stored in the Czech pre-trained model can be successfully reused to solve tasks in Slovak, outperforming even much larger public multilingual models.
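The weight initialization described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the parameter names, sizes, and the helpers `init_params` and `warm_start` are hypothetical, and flat lists of floats stand in for real tensors. The idea shown is only that a new model starts from the parameters of an existing checkpoint wherever names and sizes match, and keeps its random initialization elsewhere.

```python
import random


def init_params(shapes, seed):
    """Randomly initialize a flat parameter dict: name -> list of floats."""
    rng = random.Random(seed)
    return {
        name: [rng.gauss(0.0, 0.02) for _ in range(size)]
        for name, size in shapes.items()
    }


def warm_start(target, source):
    """Copy every source parameter whose name and size match the target.

    Returns the names that were copied; all other target parameters
    keep their random initialization.
    """
    copied = []
    for name, values in source.items():
        if name in target and len(target[name]) == len(values):
            target[name] = list(values)  # copy, so the models do not share storage
            copied.append(name)
    return copied


# Hypothetical parameter names and sizes, for illustration only.
czech_shapes = {"feature_extractor.conv0": 8, "encoder.layer0.attn": 16, "output_head": 5}
slovak_shapes = {"feature_extractor.conv0": 8, "encoder.layer0.attn": 16, "output_head": 6}

czech = init_params(czech_shapes, seed=0)    # stands in for the Czech checkpoint
slovak = init_params(slovak_shapes, seed=1)  # fresh Slovak model before training

copied = warm_start(slovak, czech)
print(sorted(copied))
```

In this toy setup the two shared layers are copied and the hypothetical `output_head`, whose size differs, is left at its random values. Training on Slovak data would then continue from the copied weights rather than from scratch.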