Effects of Layer Freezing on Transferring a Speech Recognition System to Under-resourced Languages
This research provides insights into effective transfer learning strategies for automatic speech recognition in under-resourced languages, benefiting researchers and developers working on low-resource ASR.
This paper investigates the impact of layer freezing on transferring speech recognition models to under-resourced languages. The authors find that freezing even a single layer significantly improves results when transferring a pre-trained DeepSpeech model to German and Swiss German datasets, compared to training from scratch.
In this paper, we investigate the effect of layer freezing on the effectiveness of model transfer in automatic speech recognition. We experiment with Mozilla's DeepSpeech architecture on German and Swiss German speech datasets and compare training from scratch with transferring a pre-trained model. We also compare different layer freezing schemes and find that freezing even a single layer already significantly improves results.
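To make the idea of layer freezing concrete, the following is a minimal sketch in plain Python, not the authors' implementation and not DeepSpeech code. It assumes a hypothetical five-layer model whose parameters are stored per layer, and it assumes that freezing proceeds from the input side; the layer names, learning rate, and toy gradients are illustrative only. A frozen layer keeps its pre-trained weights during a gradient step, while all other layers are updated as usual.

```python
from typing import Dict, List, Set

# Hypothetical layer names for illustration; not taken from DeepSpeech.
LAYER_NAMES = ["layer_1", "layer_2", "layer_3", "layer_4", "layer_5"]

Params = Dict[str, List[float]]


def frozen_layers(num_frozen: int) -> Set[str]:
    """Select the first `num_frozen` layers (assumed: counted from the input side)."""
    if not 0 <= num_frozen <= len(LAYER_NAMES):
        raise ValueError("num_frozen must be between 0 and the number of layers")
    return set(LAYER_NAMES[:num_frozen])


def sgd_step(params: Params, grads: Params, frozen: Set[str], lr: float = 0.1) -> Params:
    """Apply one SGD step; frozen layers keep their pre-trained values."""
    updated: Params = {}
    for name, weights in params.items():
        if name in frozen:
            # Frozen: copy the pre-trained weights unchanged.
            updated[name] = list(weights)
        else:
            # Trainable: ordinary gradient descent update.
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
    return updated


# Toy "pre-trained" weights and gradients from the target-language data.
pretrained: Params = {name: [1.0, 1.0] for name in LAYER_NAMES}
grads: Params = {name: [0.5, -0.5] for name in LAYER_NAMES}

# Transfer with only one frozen layer.
after = sgd_step(pretrained, grads, frozen_layers(1))
```

In this sketch, `frozen_layers(0)` corresponds to plain fine-tuning of the transferred model, and larger values correspond to freezing schemes that keep more of the pre-trained network fixed.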