AS LG SD SP

May 15, 2019

A general-purpose deep learning approach to model time-varying audio effects

Marco A. Martínez Ramírez, Emmanouil Benetos, Joshua D. Reiss

arXiv:1905.06148v29.721 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for a generic approach to modeling time-varying audio effects. The advance is incremental: it builds on existing neural network methods for audio processing.

The authors tackle the problem of modeling time-varying audio effects with a general-purpose deep learning architecture. The resulting model handles various linear and nonlinear effects, and its performance is measured with a proposed psychoacoustic metric.

Audio processors whose parameters are modified periodically over time are often referred to as time-varying or modulation-based audio effects. Most existing methods for modeling these types of effect units are optimized for a very specific circuit and cannot be efficiently generalized to other time-varying effects. Based on convolutional and recurrent neural networks, we propose a deep learning architecture for generic black-box modeling of audio processors with long-term memory. We explore the capabilities of deep neural networks to learn such long temporal dependencies, and we show the network modeling various linear and nonlinear, time-varying and time-invariant audio effects. To measure the performance of the model, we propose an objective metric based on the psychoacoustics of modulation frequency perception. We also analyze what the model is actually learning and how the given task is accomplished.
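To make the term "time-varying audio effect" concrete, here is a minimal sketch of a tremolo, in which a low-frequency oscillator (LFO) periodically modulates the gain. It is an illustration only and not the model or any effect implementation from the paper; the function name `tremolo` and the parameters `rate_hz` and `depth` are assumptions chosen for this example.

```python
import math

def tremolo(samples, sample_rate, rate_hz=5.0, depth=0.5):
    """Hypothetical tremolo: a linear, time-varying effect.

    The gain applied to each sample depends on the sample index n,
    so the processor's behavior changes periodically over time.
    """
    out = []
    for n, x in enumerate(samples):
        # Sinusoidal LFO in [-1, 1] at rate_hz cycles per second.
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        # Map the LFO to a gain in [1 - depth, 1].
        gain = 1.0 - depth * 0.5 * (1.0 + lfo)
        out.append(x * gain)
    return out

# Usage: one second of a 440 Hz sine at 16 kHz, modulated at 5 Hz.
sample_rate = 16000
dry = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(sample_rate)]
wet = tremolo(dry, sample_rate)
```

A black-box model of such an effect has to recover the slow modulation from input and output audio alone, which is why the abstract emphasizes long temporal dependencies: at 5 Hz, one modulation period spans 3200 samples at this sample rate.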

