SD · Jun 22, 2017

A Wavenet for Speech Denoising

arXiv:1706.07162v3 · 484 citations
Originality: Highly original
AI Analysis

This paper addresses speech denoising for audio processing applications, proposing a waveform-level model as a more parallelizable alternative to traditional spectrogram-based methods.

The paper tackles speech denoising with an end-to-end, Wavenet-based method that operates directly on the waveform, so it avoids the phase discarding inherent in magnitude-spectrogram techniques. In both computational and perceptual evaluations, the proposed method is preferred to Wiener filtering.
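The phase limitation mentioned above can be illustrated with a small sketch. This is not taken from the paper; it only assumes NumPy and uses a random array as a stand-in for a waveform frame. A real signal and its time-reversed copy are different waveforms, yet they share the same magnitude spectrum, so a front-end that keeps only magnitudes cannot tell them apart.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)   # hypothetical stand-in for a short waveform frame
x_rev = x[::-1]                # time-reversed copy: a different signal

# Magnitude spectra (the phase is dropped by np.abs)
mag = np.abs(np.fft.rfft(x))
mag_rev = np.abs(np.fft.rfft(x_rev))

same_magnitude = np.allclose(mag, mag_rev)  # magnitudes match
same_waveform = np.allclose(x, x_rev)       # the waveforms do not

print("same magnitude spectrum:", same_magnitude)
print("same waveform:", same_waveform)
```

The two signals differ only in phase, which is exactly the information a magnitude-only representation throws away.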

Currently, most speech processing techniques use magnitude spectrograms as front-end and are therefore by default discarding part of the signal: the phase. In order to overcome this limitation, we propose an end-to-end learning method for speech denoising based on Wavenet. The proposed model adaptation retains Wavenet's powerful acoustic modeling capabilities, while significantly reducing its time complexity by eliminating its autoregressive nature. Specifically, the model makes use of non-causal, dilated convolutions and predicts target fields instead of a single target sample. The discriminative adaptation of the model we propose learns in a supervised fashion by minimizing a regression loss. These modifications make the model highly parallelizable during both training and inference. Both computational and perceptual evaluations indicate that the proposed method is preferred to Wiener filtering, a common method based on processing the magnitude spectrogram.
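The ideas in the abstract (non-causal dilated convolutions, a target field rather than a single output sample, and a regression loss) can be sketched roughly as follows. This is a minimal illustration and not the authors' architecture: the 3-tap kernel, the dilation schedule, the tanh nonlinearity, the random weights, and the mean absolute error loss are all assumptions made for the example, and gated units and residual or skip connections are left out.

```python
import numpy as np

def dilated_conv_noncausal(x, w, dilation):
    """'Valid' non-causal 1-D convolution with a 3-tap kernel.

    Each output sample at position t combines inputs at t - d, t and t + d,
    so it sees past and future context. Output is 2*d samples shorter than x.
    """
    d = dilation
    return w[0] * x[:-2 * d] + w[1] * x[d:-d] + w[2] * x[2 * d:]

def receptive_field(dilations, kernel_size=3):
    """Number of input samples that influence one output sample."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def denoise_fragment(noisy, weights, dilations):
    """Run the whole stack in one pass; no sample-by-sample autoregression."""
    h = noisy
    for w, d in zip(weights, dilations):
        h = np.tanh(dilated_conv_noncausal(h, w, d))
    return h

dilations = [1, 2, 4, 8]             # assumed schedule, for illustration only
rf = receptive_field(dilations)      # 1 + 2 * (1 + 2 + 4 + 8)
target_field = 16                    # samples predicted per forward pass
input_len = rf + target_field - 1    # input needed to produce that many outputs

rng = np.random.default_rng(0)
weights = [0.5 * rng.standard_normal(3) for _ in dilations]
noisy = rng.standard_normal(input_len)    # placeholder noisy fragment
clean = rng.standard_normal(target_field) # placeholder clean target

pred = denoise_fragment(noisy, weights, dilations)
loss = np.mean(np.abs(pred - clean))      # assumed regression loss (MAE)

print(rf, input_len, pred.shape, loss)
```

Because every sample of the target field comes out of a single forward pass, the cost of the shared receptive-field context is spread over many outputs, which is the intuition behind the parallelism claim in the abstract.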

Code Implementations: 7 repos

Data from Papers with Code (CC-BY-SA-4.0)
