Adapting Large Language Models for Character-based Augmentative and Alternative Communication
This work aims to make AAC interfaces more efficient and accurate for users who write letter-by-letter. The contribution appears incremental, since it adapts existing LLMs rather than introducing a fundamentally new approach.
The researchers adapt large language models for character-based Augmentative and Alternative Communication (AAC) by developing an algorithm that produces character predictions from subword LLMs. The algorithm gives more accurate predictions than a classification layer, a byte-level LLM, or an n-gram model.
Users of Augmentative and Alternative Communication (AAC) may write letter-by-letter via an interface that uses a character language model. However, most state-of-the-art large pretrained language models predict subword tokens of variable length. We investigate how to practically use such models to make accurate and efficient character predictions. Our algorithm for producing character predictions from a subword large language model (LLM) provides more accurate predictions than using a classification layer, a byte-level LLM, or an n-gram model. Additionally, we investigate a domain adaptation procedure that uses a large dataset of sentences we curated by scoring how useful each sentence might be for spoken or written AAC communication. We find our procedure further improves model performance on simple, conversational text.
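To make the core idea concrete, here is a minimal sketch of how a distribution over variable-length subword tokens could be turned into a distribution over the next character. It is not the authors' algorithm, which the abstract does not spell out. The function name `next_char_distribution`, the toy vocabulary, and its probabilities are all hypothetical, and the sketch simplifies by looking only one token ahead and ignoring tokens that end exactly at the typed prefix.

```python
from collections import defaultdict


def next_char_distribution(token_probs, prefix):
    """Marginalize subword-token probabilities into next-character probabilities.

    token_probs: dict mapping each candidate token string to its probability
                 (a hypothetical stand-in for one step of a subword LLM).
    prefix:      characters the user has already typed in the current token.
    """
    mass = defaultdict(float)
    for token, p in token_probs.items():
        # Keep only tokens that agree with the typed prefix and extend it.
        if token.startswith(prefix) and len(token) > len(prefix):
            # The character right after the prefix is this token's "vote".
            mass[token[len(prefix)]] += p

    total = sum(mass.values())
    if total == 0.0:
        return {}  # No token is consistent with the prefix.

    # Renormalize so the character probabilities sum to 1.
    return {ch: m / total for ch, m in mass.items()}


# Toy, made-up token distribution (assumption for illustration only).
toy_token_probs = {"the": 0.4, "then": 0.1, "thin": 0.1, "to": 0.2, "a": 0.2}
dist = next_char_distribution(toy_token_probs, "th")
print(dist)
```

In this toy case only "the", "then", and "thin" are consistent with the typed prefix "th", so the probability mass is split between the characters "e" and "i". A practical system would also have to handle the many ways a character string can be tokenized, which this sketch leaves out.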