CL · Mar 27, 2024

Towards a World-English Language Model for On-Device Virtual Assistants

Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil

arXiv:2403.18783v11.0 · h-index: 14 · ICASSP

Originality: Incremental advance

AI Analysis

This work addresses scalability issues for on-device virtual assistants by enabling multi-dialect support without performance degradation, though it is incremental as it builds on existing production models.

The paper tackles the problem of scaling and maintaining neural network language models for virtual assistants by combining regional English variants into a single "World English" model that meets the accuracy, latency, and memory constraints of single-dialect models.

Neural Network Language Models (NNLMs) for Virtual Assistants (VAs) are generally language-, region-, and in some cases, device-dependent, which increases the effort to scale and maintain them. Combining NNLMs for one or more of the categories is one way to improve scalability. In this work, we combine regional variants of English to build a "World English" NNLM for on-device VAs. In particular, we investigate the application of adapter bottlenecks to model dialect-specific characteristics in our existing production NNLMs and enhance the multi-dialect baselines. We find that adapter modules are more effective in modeling dialects than specializing entire sub-networks. Based on this insight and leveraging the design of our production models, we introduce a new architecture for World English NNLM that meets the accuracy, latency, and memory constraints of our single-dialect models.
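To make the idea of an adapter bottleneck concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the residual form h + W_up * relu(W_down * h), the zero-initialized up-projection, the layer sizes, and the dialect labels (en_US, en_GB, en_IN) are all illustrative assumptions. It only shows how one small per-dialect module could sit on top of a shared hidden state.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


class Adapter:
    """Bottleneck adapter (assumed form): h + W_up * relu(W_down * h)."""

    def __init__(self, hidden_size, bottleneck_size, rng):
        # Down-projection (hidden -> bottleneck) with small random weights.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                       for _ in range(bottleneck_size)]
        # Up-projection (bottleneck -> hidden) starts at zero, so an
        # untrained adapter leaves the shared representation unchanged.
        self.w_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def __call__(self, hidden):
        update = matvec(self.w_up, relu(matvec(self.w_down, hidden)))
        # Residual connection: the adapter only adds a correction.
        return [h + u for h, u in zip(hidden, update)]

    def num_parameters(self):
        return (sum(len(row) for row in self.w_down)
                + sum(len(row) for row in self.w_up))


# Hypothetical sizes and dialect labels, chosen only for illustration.
HIDDEN, BOTTLENECK = 8, 2
rng = random.Random(0)
adapters = {dialect: Adapter(HIDDEN, BOTTLENECK, rng)
            for dialect in ("en_US", "en_GB", "en_IN")}


def apply_dialect_adapter(hidden, dialect):
    """Route a shared hidden state through the adapter for one dialect."""
    return adapters[dialect](hidden)


hidden_state = [0.5, -1.0, 0.25, 2.0, 0.0, -0.5, 1.5, -2.0]
output = apply_dialect_adapter(hidden_state, "en_GB")
```

With these toy sizes each adapter holds 2 * 8 * 2 = 32 weights, compared with 8 * 8 = 64 for a full dense layer of the same width. That gap is the general reason a bottleneck is cheaper than duplicating a whole sub-network per dialect.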
