AS · LG · SD · Sep 19, 2019

WEnets: A Convolutional Framework for Evaluating Audio Waveforms

arXiv:1909.09024v14.36 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses audio quality assessment for applications such as telecommunications. Its contribution is incremental: it adapts existing convolutional methods to a specific domain.

The authors introduce WEnets, a convolutional framework for evaluating audio waveforms, and use it to build NAWEnet, a single-ended (no-reference) network. Trained three separate times to emulate PESQ, POLQA, and STOI, NAWEnet reaches testing correlations of 0.95, 0.92, and 0.95, respectively, when trained on only 50% of the available data and tested on 40%.
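To make the reported testing correlations concrete, here is a minimal sketch of how a correlation between predicted and target scores could be computed. It assumes the Pearson correlation coefficient, which the summary does not state explicitly. The function name `pearson_correlation` and the example scores are hypothetical, not taken from the paper or its code.

```python
import math


def pearson_correlation(predicted, target):
    """Pearson correlation between two equal-length lists of scores."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_p = sum(predicted) / len(predicted)
    mean_t = sum(target) / len(target)
    # Covariance and variances are left unnormalized; the factors cancel.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_t)


# Hypothetical scores for illustration only (not data from the paper):
# target_scores stands in for the metric being emulated (e.g. PESQ),
# predicted_scores stands in for the network's estimates.
target_scores = [1.2, 2.5, 3.1, 3.8, 4.4]
predicted_scores = [1.4, 2.3, 3.3, 3.6, 4.5]
r = pearson_correlation(predicted_scores, target_scores)
print(f"correlation: {r:.3f}")
```

A value close to 1 means the network's estimates rise and fall together with the metric it emulates across the test set.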

We describe a new convolutional framework for waveform evaluation, WEnets, and build a Narrowband Audio Waveform Evaluation Network, or NAWEnet, using this framework. NAWEnet is single-ended (or no-reference) and was trained three separate times in order to emulate PESQ, POLQA, or STOI with testing correlations 0.95, 0.92, and 0.95, respectively, when training on only 50% of available data and testing on 40%. Stacks of 1-D convolutional layers and non-linear downsampling learn which features are important for quality or intelligibility estimation. This straightforward architecture simplifies the interpretation of its inner workings and paves the way for future investigations into higher sample rates and accurate no-reference subjective speech quality predictions.
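The idea of stacking 1-D convolutions with non-linear downsampling can be illustrated with a small, dependency-free sketch. This is not the NAWEnet architecture. It assumes a single channel, fixed hand-picked kernels in place of learned ones, a ReLU activation, and max pooling as the non-linear downsampling step. None of these details are specified in the abstract, and the names `conv1d`, `relu`, `max_pool1d`, and `conv_block` are hypothetical.

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D convolution (cross-correlation form, no kernel flip)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectifier; an assumed activation, not taken from the paper."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def conv_block(signal, kernel, pool_size):
    """One convolution, one non-linearity, then non-linear downsampling."""
    return max_pool1d(relu(conv1d(signal, kernel)), pool_size)


# Hypothetical waveform samples and kernels, chosen only for illustration.
waveform = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 0.5, 1.0, 0.5]
kernels = [[1.0, -1.0], [0.5, 0.5]]

features = waveform
for kernel in kernels:
    # Each block shortens the sequence, so deeper blocks cover more input samples.
    features = conv_block(features, kernel, pool_size=2)

print(len(waveform), "->", len(features))
```

In a trained network the kernels would be learned from data and each layer would carry many channels. The sketch only shows how repeated convolution and pooling reduce a waveform to a short feature sequence.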

Code Implementations: 1 repo
