Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication
This work addresses the need for more expressive communication tools for AAC users, though it appears incremental in applying existing technologies like ASR and LLMs to this domain.
The paper tackles limited expressivity in augmentative and alternative communication (AAC) systems by developing Speak Ease, which integrates multimodal inputs and LLMs to enable more personalized and contextually relevant communication. The system's potential was assessed through an exploratory feasibility study and focus group with speech and language pathologists.
In this paper, we present Speak Ease: an augmentative and alternative communication (AAC) system to support users' expressivity by integrating multimodal input, including text, voice, and contextual cues (conversational partner and emotional tone), with large language models (LLMs). Speak Ease combines automatic speech recognition (ASR), context-aware LLM-based outputs, and personalized text-to-speech technologies to enable more personalized, natural-sounding, and expressive communication. Through an exploratory feasibility study and focus group evaluation with speech and language pathologists (SLPs), we assessed Speak Ease's potential to enable expressivity in AAC. The findings highlight the priorities and needs of AAC users and the system's ability to enhance user expressivity by supporting more personalized and contextually relevant communication. This work provides insights into the use of multimodal inputs and LLM-driven features to improve AAC systems and support expressivity.