AS · LG · SD · SP · May 15, 2019

A general-purpose deep learning approach to model time-varying audio effects

arXiv:1905.06148v2 · 21 citations
AI Analysis

This work addresses the need for a generic modeling approach for time-varying audio effects. The contribution is incremental, as it builds on existing neural network methods for audio processing.

The authors tackled the problem of modeling time-varying audio effects with a general-purpose deep learning architecture, achieving a model that can handle various linear and nonlinear effects, as validated by a proposed psychoacoustic metric.

Audio processors whose parameters are modified periodically over time are often referred to as time-varying or modulation based audio effects. Most existing methods for modeling these types of effect units are optimized to a very specific circuit and cannot be efficiently generalized to other time-varying effects. Based on convolutional and recurrent neural networks, we propose a deep learning architecture for generic black-box modeling of audio processors with long-term memory. We explore the capabilities of deep neural networks to learn such long temporal dependencies and we show the network modeling various linear and nonlinear, time-varying and time-invariant audio effects. In order to measure the performance of the model, we propose an objective metric based on the psychoacoustics of modulation frequency perception. We also analyze what the model is actually learning and how the given task is accomplished.
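
To make "parameters modified periodically over time" concrete, here is a minimal sketch of one familiar modulation based effect, a tremolo, in which a low-frequency oscillator (LFO) periodically varies the gain applied to the signal. This is an illustration only and is not the paper's model or dataset: the function name `tremolo` and the parameters `rate_hz` and `depth` are hypothetical, and the sinusoidal LFO shape is an assumption.

```python
import math


def tremolo(samples, sample_rate, rate_hz=5.0, depth=0.5):
    """Apply a tremolo: amplitude modulation by a sinusoidal LFO.

    Hypothetical helper for illustration; not taken from the paper.
    """
    out = []
    for n, x in enumerate(samples):
        # LFO oscillates in [-1, 1] at rate_hz.
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        # The gain is the time-varying parameter: it moves
        # periodically between 1 - depth and 1.
        gain = 1.0 - depth * 0.5 * (1.0 + lfo)
        out.append(gain * x)
    return out


sample_rate = 16000
# One second of a 440 Hz sine as a test input.
dry = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
       for n in range(sample_rate)]
wet = tremolo(dry, sample_rate)
```

The LFO period here (0.2 s at 5 Hz) is far longer than a single audio sample, which is one way to see why a model of such an effect would need to capture long temporal dependencies.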

