ASSD May 23, 2019

A Perceptual Weighting Filter Loss for DNN Training in Speech Enhancement

arXiv:1905.09754v3
3 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for speech enhancement systems, offering a simple loss function that can be applied without modifying DNN topology.

The paper proposes a perceptual weighting filter loss for DNN training in speech enhancement. Compared with a reference DNN trained with MSE loss, it reports better perceptual quality and stronger noise attenuation.

Single-channel speech enhancement with deep neural networks (DNNs) has shown promising performance and is thus intensively being studied. In this paper, instead of applying the mean squared error (MSE) as the loss function during DNN training for speech enhancement, we design a perceptual weighting filter loss motivated by the weighting filter as it is employed in analysis-by-synthesis speech coding, e.g., in code-excited linear prediction (CELP). The experimental results show that the proposed simple loss function improves the speech enhancement performance compared to a reference DNN with MSE loss in terms of perceptual quality and noise attenuation. The proposed loss function can be advantageously applied to an existing DNN-based speech enhancement system, without modification of the DNN topology for speech enhancement. The source code for the proposed approach is made available.
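The idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes the common CELP form of the weighting filter, W(z) = A(z/γ1) / A(z/γ2), where A(z) is the linear-prediction polynomial of the clean speech frame, and it assumes the loss is applied to magnitude spectra per frequency bin. The function names, the LPC order, the FFT size, and the values γ1 = 0.92 and γ2 = 0.6 are illustrative assumptions, not taken from the paper, whose exact formulation may differ.

```python
import numpy as np


def lpc_coefficients(frame, order=16):
    """Autocorrelation-method LPC via Levinson-Durbin.

    Returns a = [1, a_1, ..., a_p] for A(z) = 1 + sum_j a_j z^-j.
    """
    n = len(frame)
    # Autocorrelation lags 0..order
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-9  # small offset guards against an all-zero frame
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def weighting_filter_magnitude(a, n_fft=512, gamma1=0.92, gamma2=0.6):
    """|W(k)| for the assumed CELP-style filter W(z) = A(z/gamma1) / A(z/gamma2)."""
    powers = np.arange(len(a))
    # A(z/gamma) has coefficients a_j * gamma**j (bandwidth expansion)
    num = np.fft.rfft(a * gamma1**powers, n_fft)
    den = np.fft.rfft(a * gamma2**powers, n_fft)
    return np.abs(num) / np.maximum(np.abs(den), 1e-12)


def perceptually_weighted_mse(enhanced_mag, clean_mag, weight_mag):
    """Mean squared error of the spectral error after perceptual weighting."""
    weighted_error = weight_mag * (enhanced_mag - clean_mag)
    return np.mean(weighted_error**2)
```

With γ1 = γ2 the weighting is flat and the function reduces to plain MSE, which makes the relation to the reference loss easy to check. Because only the loss changes, the network topology stays untouched; for actual training the same weighting would have to be written in a framework with automatic differentiation.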

Code Implementations: 1 repo