Adaptable Multi-Domain Language Model for Transformer ASR
This work addresses the challenge of efficiently adapting ASR systems to multiple domains, reducing maintenance costs and parameter overhead, though it is incremental in its approach.
The authors address multi-domain adaptation for Transformer ASR by proposing an adapter-based language model. Extending it to a new domain adds only about 2% more parameters for the first domain and about 13% for the second and subsequent domains, and a general LM with an adapter outperforms a dedicated music-domain LM in word error rate.
We propose an adapter-based, multi-domain Transformer language model (LM) for Transformer ASR. The model consists of a large common LM and small adapters. Multi-domain adaptation is performed using only the small adapters and their related layers. The proposed model can reuse a fully fine-tuned LM, that is, an LM obtained by fine-tuning all layers of the original model. The proposed LM can be extended to new domains by adding about 2% of the parameters for the first domain and about 13% for the second and subsequent domains. The proposed model also reduces model maintenance cost, because the costly and time-consuming pre-training of the common LM can be omitted. Using the proposed adapter-based approach, we observed that a general LM with an adapter can outperform a dedicated music-domain LM in terms of word error rate (WER).
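The parameter figures quoted above can be turned into a rough accounting sketch. This is only an illustration under stated assumptions, not the authors' own calculation: it assumes both percentages are relative to the size of the common LM, that the 13% figure applies to each domain from the second onward, and that a dedicated per-domain LM would be as large as the common LM. The function names and the 100-million-parameter LM size are hypothetical.

```python
def total_parameters(base_params: int, num_domains: int,
                     first_pct: int = 2, later_pct: int = 13) -> int:
    """Parameter count of one common LM plus adapters for `num_domains` domains.

    Assumption: the first domain adds `first_pct` percent of the common LM size,
    and every later domain adds `later_pct` percent of it.
    """
    if num_domains < 0:
        raise ValueError("num_domains must be non-negative")
    if num_domains == 0:
        return base_params
    first = base_params * first_pct // 100
    later = (num_domains - 1) * (base_params * later_pct // 100)
    return base_params + first + later


def dedicated_parameters(base_params: int, num_domains: int) -> int:
    """Baseline for comparison: one full-size LM per domain (an assumption)."""
    return base_params * num_domains


if __name__ == "__main__":
    base = 100_000_000  # hypothetical common LM size, not taken from the paper
    for n in (1, 2, 3):
        print(n, total_parameters(base, n), dedicated_parameters(base, n))
```

Under these assumptions, covering three domains with a hypothetical 100-million-parameter common LM takes 128 million parameters in total, against 300 million for three separate full-size LMs.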