CL · Apr 3, 2024

Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages

arXiv:2404.02588v1 · 4 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of scaling voice assistants to new languages efficiently. It offers a slot-type independent pipeline that requires no changes to the production architecture. The advance is incremental, since it builds on existing LLM and SLU methods.

The paper extends spoken language understanding (SLU) systems to new languages by using fine-tuned large language models (LLMs) to machine-translate the training data. On the MultiATIS++ benchmark, Overall Accuracy improves from 53% to 62.18% in the cloud scenario and from 5.31% to 22.06% in the on-device scenario.

Spoken Language Understanding (SLU) models are a core component of voice assistants (VA), such as Alexa, Bixby, and Google Assistant. In this paper, we introduce a pipeline designed to extend SLU systems to new languages, utilizing Large Language Models (LLMs) that we fine-tune for machine translation of slot-annotated SLU training data. Our approach improved results on the MultiATIS++ benchmark, a primary multi-language SLU dataset, in the cloud scenario using an mBERT model. Specifically, Overall Accuracy rose from 53%, achieved by the existing state-of-the-art method, Fine and Coarse-grained Multi-Task Learning Framework (FC-MTLF), to 62.18%. In the on-device scenario (a tiny SLU model without pretraining), our method improved Overall Accuracy from 5.31% to 22.06% over the baseline Global-Local Contrastive Learning Framework (GL-CLeF) method. Unlike both FC-MTLF and GL-CLeF, our LLM-based machine translation does not require changes in the production architecture of SLU. Additionally, our pipeline is slot-type independent: it does not require any slot definitions or examples.
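The abstract reports Overall Accuracy without defining it. The sketch below assumes the definition commonly used in SLU evaluation: an utterance counts as correct only when the predicted intent and every predicted slot tag match the reference. The function name `overall_accuracy` and the sample data are illustrative and not taken from the paper's code.

```python
from typing import List, Sequence, Tuple

# One example is (intent label, per-token slot tags). Assumed representation.
Example = Tuple[str, Sequence[str]]


def overall_accuracy(gold: List[Example], pred: List[Example]) -> float:
    """Fraction of utterances with a correct intent AND all slot tags correct.

    Assumes the usual sentence-level definition; the paper may compute it
    differently in detail.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of examples")
    if not gold:
        return 0.0
    correct = sum(
        1
        for (g_intent, g_slots), (p_intent, p_slots) in zip(gold, pred)
        if g_intent == p_intent and list(g_slots) == list(p_slots)
    )
    return correct / len(gold)


# Illustrative ATIS-style data (not from the paper).
gold = [
    ("atis_flight", ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]),
    ("atis_airfare", ["O", "B-cost_relative", "O"]),
]
pred = [
    # Intent and all slots match: counted as correct.
    ("atis_flight", ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]),
    # Intent matches but one slot tag is wrong: counted as incorrect.
    ("atis_airfare", ["O", "O", "O"]),
]

print(overall_accuracy(gold, pred))  # 0.5
```

Because a single wrong slot tag invalidates the whole utterance, this metric is stricter than intent accuracy or slot F1 alone, which is consistent with the low on-device baseline figure quoted above.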

Code Implementations: 1 repo
