CL, AI
May 5, 2023

Now It Sounds Like You: Learning Personalized Vocabulary On Device

arXiv:2305.03584v3
5 citations
Originality: Incremental advance
AI Analysis

The paper addresses how on-device language models for personalized NLP applications can handle user-specific OOV words, and it represents an incremental improvement.

The paper tackles the problem of out-of-vocabulary (OOV) words in on-device language models by proposing OOV expansion, which improves OOV coverage and model accuracy while minimizing the impact on memory and latency. The method outperforms standard federated learning personalization methods on benchmarks.

In recent years, Federated Learning (FL) has shown significant advancements in its ability to perform various natural language processing (NLP) tasks. This work focuses on applying personalized FL for on-device language modeling. Due to memory and latency constraints, these models cannot support the complexity of sub-word tokenization or beam search decoding, which led to the decision to deploy a closed-vocabulary language model. However, closed-vocabulary models are unable to handle out-of-vocabulary (OOV) words belonging to specific users. To address this issue, we propose a novel technique called "OOV expansion" that improves OOV coverage and increases model accuracy while minimizing the impact on memory and latency. This method introduces a personalized "OOV adapter" that effectively transfers knowledge from a central model and learns word embeddings for personalized vocabulary. OOV expansion significantly outperforms standard FL personalization methods on a set of common FL benchmarks.
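To make the closed-vocabulary problem concrete, the following is a minimal sketch of how user-specific words collapse to an unknown token and how adding a small per-user vocabulary raises coverage. It is an illustration only and does not reproduce the paper's OOV adapter or its embedding training. The function names (`encode`, `oov_rate`, `expand_vocab`), the `<unk>` token, the toy vocabulary, and the frequency-based selection of new words are all assumptions made for this example.

```python
from collections import Counter

UNK = "<unk>"  # assumed placeholder token for out-of-vocabulary words


def encode(tokens, vocab):
    """Closed-vocabulary encoding: any word outside vocab becomes UNK."""
    return [t if t in vocab else UNK for t in tokens]


def oov_rate(tokens, vocab):
    """Fraction of tokens that the vocabulary cannot represent."""
    if not tokens:
        return 0.0
    return sum(t not in vocab for t in tokens) / len(tokens)


def expand_vocab(vocab, user_tokens, max_new_words):
    """Add the user's most frequent OOV words, up to a fixed budget.

    Frequency-based selection is an assumption for illustration; the
    budget stands in for the memory constraint on the device.
    """
    oov_counts = Counter(t for t in user_tokens if t not in vocab)
    new_words = [w for w, _ in oov_counts.most_common(max_new_words)]
    return set(vocab) | set(new_words)


# Toy central vocabulary and one user's text (both hypothetical).
central_vocab = {"see", "you", "at", "the", "gym", "with"}
user_text = "see you at the dojo with sensei see sensei at the dojo".split()

before = oov_rate(user_text, central_vocab)
personal_vocab = expand_vocab(central_vocab, user_text, max_new_words=2)
after = oov_rate(user_text, personal_vocab)
print(f"OOV rate before: {before:.2f}, after: {after:.2f}")
```

In this toy setup the expanded vocabulary stays on the device with the user's text. The budget `max_new_words` is the knob that trades coverage against memory, which is the tension the abstract describes.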

