AS · LG · SD · Dec 3, 2020

Text-to-speech for the hearing impaired

arXiv:2012.02174v2 · 4 citations
AI Analysis

This work addresses the critical problem of improving speech intelligibility for individuals with hearing impairment by generating personalized amplified speech, offering a potential alternative to traditional hearing aids.

This paper uses a text-to-speech (TTS) system to generate personalized amplified speech for the hearing impaired. The proposed algorithm, embedded in a Tacotron2 and WaveGlow TTS framework, restores loudness to normal perception at a high resolution in time, frequency and level. In subjective evaluations, the system produced high-quality audio with sound quality similar to that of original or linearly amplified speech, but with considerably higher speech intelligibility in noise.

Text-to-speech (TTS) systems offer the opportunity to compensate for hearing loss at the source rather than correcting for it at the receiving end. This removes limitations, such as the time constraints on algorithms that amplify sound in a hearing aid, and can lead to higher speech quality. We propose an algorithm that restores loudness to normal perception at a high resolution in time, frequency and level, and embed it in a TTS system that uses Tacotron2 and WaveGlow to produce individually amplified speech. Subjective evaluations showed that the proposed algorithm produced high-quality audio with sound quality similar to that of original or linearly amplified speech, but with considerably higher speech intelligibility in noise. Transfer learning quickly adapted the produced spectra from original speech to individually amplified speech and resulted in high speech quality and intelligibility, which gives us a way to train an individual TTS system efficiently.
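
The contrast between linear amplification and amplification that depends on both frequency and level can be illustrated with a minimal sketch. This is not the paper's algorithm, whose loudness model the abstract does not specify. It assumes a simple rule that maps the normal-hearing level range onto a listener's residual range in each frequency band. The function names, the two-band audiogram, and the 100 dB uncomfortable-loudness level are all hypothetical values chosen for illustration.

```python
import numpy as np

def linear_amplify(levels_db, gain_db):
    """Linear amplification: one fixed gain per band, regardless of input level."""
    return levels_db + gain_db[:, None]

def level_dependent_amplify(levels_db, threshold_db, ucl_db, normal_ucl_db=100.0):
    """Illustrative compression rule (an assumption, not the paper's method).

    Maps the normal-hearing range [0, normal_ucl_db] in each band onto the
    listener's residual range [threshold_db, ucl_db], so soft sounds receive
    more gain than loud ones.
    """
    slope = (ucl_db - threshold_db) / normal_ucl_db
    return threshold_db[:, None] + slope[:, None] * levels_db

# Hypothetical band levels in dB, shape (bands, frames): soft, medium, loud.
levels_db = np.array([[20.0, 60.0, 90.0],
                      [20.0, 60.0, 90.0]])

# Hypothetical listener: mild loss in band 0, stronger loss in band 1.
threshold_db = np.array([10.0, 50.0])
ucl_db = np.array([100.0, 100.0])  # assumed uncomfortable-loudness level

linear = linear_amplify(levels_db, gain_db=threshold_db)
adaptive = level_dependent_amplify(levels_db, threshold_db, ucl_db)

print("linear:\n", linear)
print("level-dependent:\n", adaptive)
```

In this toy setting the linear gain pushes the loud frame in the second band to 140 dB, far above the assumed uncomfortable level, while the level-dependent mapping keeps every frame within the listener's range. In a TTS pipeline, a mapping of this kind could in principle be applied to the spectra before vocoding, but how the paper does so is not stated in the text above.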

