AS SD, Nov 7, 2018

High-quality speech coding with SampleRNN

Janusz Klejsa, Per Hedelin, Cong Zhou, Roy Fejgin, Lars Villemoes

arXiv:1811.03021v1 14.563 citations

Originality: Highly original

AI Analysis

This paper addresses speech compression for applications requiring efficient transmission or storage, offering a novel generative approach with competitive perceptual quality.

The researchers tackled speech coding by developing a generative model based on SampleRNN that operates at significantly lower bitrates while matching or surpassing the perceptual quality of state-of-the-art classic wide-band codecs, as validated through listening tests.

We provide a speech coding scheme employing a generative model based on SampleRNN that, while operating at significantly lower bitrates, matches or surpasses the perceptual quality of state-of-the-art classic wide-band codecs. Moreover, it is demonstrated that the proposed scheme can provide a meaningful rate-distortion trade-off without retraining. We evaluate the proposed scheme in a series of listening tests and discuss limitations of the approach.
