AS | AI | SD | SP | Sep 27, 2024

Speech Boosting: Low-Latency Live Speech Enhancement for TWS Earbuds

arXiv:2409.18705v1 | 2 citations | h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses on-device speech enhancement for TWS earbud users in noisy settings. The contribution appears incremental, since it builds on existing models and adds optimizations.

The paper tackles low-latency speech enhancement for true wireless stereo earbuds in noisy environments, reporting substantial quality improvements while reducing computational complexity and bringing latency under 3 ms.

This paper introduces a speech enhancement solution tailored for on-device use in true wireless stereo (TWS) earbuds. The solution is designed to support conversations in noisy environments with active noise cancellation (ANC) activated. The primary challenges for speech enhancement models in this context are computational complexity, which limits on-device usage, and latency, which must be less than 3 ms to preserve a live conversation. To address these issues, we evaluated several crucial design elements, including the network architecture and domain, the design of loss functions, the pruning method, and hardware-specific optimization. As a result, we demonstrated substantial improvements in speech enhancement quality compared with baseline models, while simultaneously reducing computational complexity and algorithmic latency.
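
To make the 3 ms constraint concrete, the sketch below converts a processing frame into algorithmic latency and checks it against the budget. It is an illustration only and not the paper's method: the 16 kHz sample rate, the frame sizes, and the simplified definition of algorithmic latency (frame length plus look-ahead, ignoring compute and transmission time) are all assumptions, and the function names are hypothetical.

```python
# Illustrative latency-budget arithmetic. The sample rate and frame sizes
# below are assumptions for the example, not values taken from the paper.

def algorithmic_latency_ms(frame_samples: int, lookahead_samples: int,
                           sample_rate_hz: int) -> float:
    """Latency caused by buffering alone: the model must wait for one full
    frame plus any look-ahead samples before it can emit output."""
    return 1000.0 * (frame_samples + lookahead_samples) / sample_rate_hz


def meets_budget(frame_samples: int, lookahead_samples: int,
                 sample_rate_hz: int, budget_ms: float = 3.0) -> bool:
    """True if the algorithmic latency is strictly below the budget."""
    latency = algorithmic_latency_ms(frame_samples, lookahead_samples,
                                     sample_rate_hz)
    return latency < budget_ms


if __name__ == "__main__":
    sample_rate_hz = 16_000  # assumed sample rate
    for frame in (16, 32, 48, 256):
        latency = algorithmic_latency_ms(frame, 0, sample_rate_hz)
        ok = meets_budget(frame, 0, sample_rate_hz)
        print(f"frame={frame:4d} samples  latency={latency:6.2f} ms  within budget: {ok}")
```

Under these assumptions, a 32-sample frame at 16 kHz costs 2 ms of buffering, while a 256-sample frame costs 16 ms, which illustrates why very short frames are needed when the total budget is below 3 ms.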

