NasoVoce: A Nose-Mounted Low-Audibility Speech Interface for Always-Available Speech Interaction
NasoVoce addresses the problem of discreet voice interaction that stays reliable in noisy environments, though the contribution appears incremental because it builds on existing silent-speech methods.
The paper tackles the challenge of building a low-audibility speech interface for always-available AI interaction. It introduces NasoVoce, a nose-mounted device that fuses acoustic and vibration signals, and reports improved speech recognition along with better speech-quality scores on metrics such as PESQ and STOI.
Silent and whispered speech offer promise for always-available voice interaction with AI, yet existing methods struggle to balance vocabulary size, wearability, silence, and noise robustness. We present NasoVoce, a nose-bridge-mounted interface that integrates a microphone and a vibration sensor. Positioned at the nasal pads of smart glasses, it unobtrusively captures both acoustic and vibration signals. The nasal bridge, close to the mouth, allows access to bone- and skin-conducted speech and enables reliable capture of low-volume utterances such as whispered speech. While the microphone captures high-quality audio, it is highly sensitive to environmental noise. Conversely, the vibration sensor is robust to noise but yields lower signal quality. By fusing these complementary inputs, NasoVoce generates high-quality speech robust against interference. Evaluation with Whisper Large-v2, PESQ, STOI, and MUSHRA ratings confirms improved recognition and quality. NasoVoce demonstrates the feasibility of a practical interface for always-available, continuous, and discreet AI voice conversations.
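The abstract does not say how the two signals are fused, so the following is only a minimal sketch of the complementary-sensor idea, not NasoVoce's actual method. It assumes the two signals are time-aligned and that an ambient noise level is already known. The names `fuse_frames`, `noise_rms`, and `frame_len`, and the weighting rule itself, are hypothetical. Each frame leans on the microphone when it is clean and on the vibration sensor when noise dominates.

```python
import math


def rms(frame):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame)) if frame else 0.0


def fuse_frames(mic, vib, noise_rms, frame_len=4):
    """Blend two time-aligned signals frame by frame.

    Hypothetical rule (an assumption, not taken from the paper): the
    microphone weight shrinks as the assumed ambient noise level
    approaches the level measured in the microphone frame.
    """
    if len(mic) != len(vib):
        raise ValueError("signals must be time-aligned and of equal length")
    fused = []
    for start in range(0, len(mic), frame_len):
        m = mic[start:start + frame_len]
        v = vib[start:start + frame_len]
        level = rms(m)
        # Weight is 1 with no noise and 0 once noise matches or exceeds
        # the microphone frame level.
        w = max(0.0, 1.0 - noise_rms / level) if level > 0 else 0.0
        fused.extend(w * a + (1.0 - w) * b for a, b in zip(m, v))
    return fused
```

With `noise_rms=0.0` the output equals the microphone signal, and once `noise_rms` reaches the microphone frame level the output falls back entirely to the vibration signal. A learned fusion model would replace this fixed rule in practice.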