ASSD, Nov 20, 2019

Perceptual Loss Function for Neural Modelling of Audio Systems

arXiv:1911.08922v1, 36 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the perceived sound quality of neural network models in audio processing. Its contribution is incremental: it builds on the authors' previous training method and changes only the pre-emphasis filter used in the loss function.

The authors aim to improve neural network models for nonlinear audio processing by using perceptually relevant pre-emphasis filters in the loss function. In listening tests, an A-weighting filter gave the best improvement in sound quality among the tested filters, with no increase in computational cost.

This work investigates alternate pre-emphasis filters used as part of the loss function during neural network training for nonlinear audio processing. In our previous work, the error-to-signal ratio loss function was used during network training, with a first-order highpass pre-emphasis filter applied to both the target signal and neural network output. This work considers more perceptually relevant pre-emphasis filters, which include lowpass filtering at high frequencies. We conducted listening tests to determine whether they offer an improvement to the quality of a neural network model of a guitar tube amplifier. Listening test results indicate that the use of an A-weighting pre-emphasis filter offers the best improvement among the tested filters. The proposed perceptual loss function improves the sound quality of neural network models in audio processing without affecting the computational cost.
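To make the loss concrete, here is a minimal NumPy sketch of an error-to-signal ratio (ESR) with an optional pre-emphasis filter applied to both the target and the network output. It is an illustration under stated assumptions, not the authors' implementation: the function names, the highpass coefficient of 0.95, the 44.1 kHz sample rate, and the choice to apply A-weighting as a frequency-domain gain (using the standard analog A-weighting magnitude formula) are all assumptions made for this example.

```python
import numpy as np


def highpass_preemphasis(x, coeff=0.95):
    """First-order highpass y[n] = x[n] - coeff * x[n-1].

    The coefficient 0.95 is an assumed value for illustration.
    """
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[1:] -= coeff * x[:-1]
    return y


def a_weighting_gain(freqs_hz):
    """Linear A-weighting magnitude from the standard analog formula.

    The +2.00 dB offset normalises the gain to about 0 dB at 1 kHz.
    """
    f2 = np.asarray(freqs_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = (
        (f2 + 20.6 ** 2)
        * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 10.0 ** (2.0 / 20.0) * num / den


def a_weighting_preemphasis(x, sample_rate=44100):
    """Apply A-weighting as a zero-phase gain in the frequency domain.

    This is a simplification chosen for the sketch; a real-time or
    training-time version would more likely use a time-domain filter.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return np.fft.irfft(spectrum * a_weighting_gain(freqs), n=len(x))


def esr_loss(target, output, preemphasis=None):
    """Error-to-signal ratio: sum((y - y_hat)^2) / sum(y^2).

    If a pre-emphasis function is given, it is applied to both signals
    before the ratio is computed.
    """
    target = np.asarray(target, dtype=float)
    output = np.asarray(output, dtype=float)
    if preemphasis is not None:
        target = preemphasis(target)
        output = preemphasis(output)
    return np.sum((target - output) ** 2) / np.sum(target ** 2)
```

Because the filter is applied only when the loss is computed during training, the trained network itself is unchanged at inference time, which is consistent with the abstract's statement that the computational cost of the model is not affected.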
