AS · LG · SD · May 24, 2019

Fast computation of loudness using a deep neural network

arXiv:1905.10399v1 · 2 citations
Originality: Incremental advance
AI Analysis

The DNN makes real-time loudness calculation practical for audio processing applications. The advance is incremental because it speeds up an existing perceptual model rather than proposing a new one.

The paper addresses slow loudness computation by introducing a deep neural network (DNN) that predicts instantaneous loudness from sound waveforms. On a modern PC the DNN performs more than 100,000 loudness computations per second, compared with a few hundred for the existing Cambridge loudness model, and the root-mean-square deviation between the two models' predictions of instantaneous loudness level is less than 0.5 phon for unseen types of sound.

The present paper introduces a deep neural network (DNN) for predicting the instantaneous loudness of a sound from its time waveform. The DNN was trained using the output of a more complex model, called the Cambridge loudness model. While a modern PC can perform a few hundred loudness computations per second using the Cambridge loudness model, it can perform more than 100,000 per second using the DNN, allowing real-time calculation of loudness. The root-mean-square deviation between the predictions of instantaneous loudness level using the two models was less than 0.5 phon for unseen types of sound. We think that the general approach of simulating a complex perceptual model by a much faster DNN can be applied to other perceptual models to make them run in real time.
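The agreement figure quoted in the abstract is a root-mean-square deviation between two sets of loudness-level predictions. A minimal sketch of that metric follows, assuming both models' outputs are available as arrays of loudness levels in phon. The function name `rms_deviation` and the sample values are illustrative assumptions; they are not taken from the paper or from either model.

```python
import numpy as np


def rms_deviation(reference_phon, surrogate_phon):
    """Root-mean-square deviation (in phon) between two prediction series.

    reference_phon: loudness levels from the slower reference model.
    surrogate_phon: loudness levels from the faster surrogate (e.g. a DNN).
    Both names are hypothetical and used here only for illustration.
    """
    reference = np.asarray(reference_phon, dtype=float)
    surrogate = np.asarray(surrogate_phon, dtype=float)
    if reference.shape != surrogate.shape:
        raise ValueError("both series must have the same shape")
    # Square the pointwise differences, average them, then take the root.
    return float(np.sqrt(np.mean((surrogate - reference) ** 2)))


# Made-up loudness levels in phon, purely to exercise the function.
reference = [40.0, 55.2, 61.0, 70.5]
surrogate = [40.3, 55.0, 61.4, 70.1]

rmsd = rms_deviation(reference, surrogate)
print(f"RMS deviation: {rmsd:.3f} phon")
```

A value below 0.5 phon on sounds that were not used for training would correspond to the level of agreement reported in the abstract.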

