AS LG SD
Oct 12, 2021

The MirrorNet: Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction

Yashish M. Siriwardena, Guilhem Marion, Shihab Shamma

arXiv:2110.05695v4

Originality: Incremental advance

AI Analysis

This work addresses the challenge of unsupervised control learning for audio synthesis, which could benefit fields like music production and robotics, though it is incremental as it builds on existing MirrorNet concepts.

The paper tackles the problem of learning audio synthesizer controls from auditory spectrograms using the MirrorNet architecture, which is inspired by sensorimotor interactions. It demonstrates that the learned controls produce melodies that closely resemble the originals, generalize to unseen melodies, and approximate complex piano renditions generated by a different synthesizer.

Experiments to understand the sensorimotor neural interactions in the human cortical speech system support the existence of a bidirectional flow of interactions between the auditory and motor regions. Their key function is to enable the brain to "learn" how to control the vocal tract for speech production. This idea is the impetus for the recently proposed "MirrorNet", a constrained autoencoder architecture. In this paper, the MirrorNet is applied to learn, in an unsupervised manner, the controls of a specific audio synthesizer (DIVA) to produce melodies only from their auditory spectrograms. The results demonstrate how the MirrorNet discovers the synthesizer parameters that generate melodies closely resembling the originals, including melodies unseen during training, and even determines the best set of parameters to approximate renditions of complex piano melodies generated by a different synthesizer. This generalizability of the MirrorNet illustrates its potential to discover, from sensory data alone, the controls of arbitrary motor plants.
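
The core idea in the abstract, recovering synthesizer controls from a spectral representation alone, can be illustrated with a toy sketch. This is not the MirrorNet and not the DIVA synthesizer: `toy_synth`, its two controls, and the brute-force search are all assumptions made for illustration. The search stands in for the learned encoder, and a plain DFT magnitude stands in for the auditory spectrogram.

```python
import cmath
import math

N = 64  # samples per toy "note" (arbitrary choice for this sketch)


def toy_synth(freq_bin, amplitude):
    """Hypothetical stand-in for a synthesizer: one sine wave from two controls."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * n / N) for n in range(N)]


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (crude stand-in for a spectrogram frame)."""
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * n / N) for n, x in enumerate(signal)))
        for k in range(N // 2 + 1)
    ]


def spectral_error(a, b):
    """Squared error between two magnitude spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recover_controls(target_spectrum):
    """Exhaustively search the control grid using only the target spectrum.

    A learned encoder would replace this search; the grid (bins 1..31,
    amplitudes 0.1..1.0) is an assumption of this sketch.
    """
    best = None
    for freq_bin in range(1, N // 2):
        for step in range(1, 11):
            amplitude = step / 10
            candidate = magnitude_spectrum(toy_synth(freq_bin, amplitude))
            err = spectral_error(candidate, target_spectrum)
            if best is None or err < best[0]:
                best = (err, freq_bin, amplitude)
    return best[1], best[2]


# The "melody" is known to the search only through its spectrum.
target = magnitude_spectrum(toy_synth(5, 0.7))
print(recover_controls(target))
```

The sketch keeps only the loop that the abstract describes: the controls are never observed directly, and a candidate is judged solely by how closely its synthesized output matches the target in the spectral domain.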
