SD · LG · AS · ML · Oct 19, 2020

MicAugment: One-shot Microphone Style Transfer

arXiv:2010.09658v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses robustness issues for audio-based models in real-world deployment, but it is incremental because it builds on existing style transfer and data augmentation techniques.

The paper tackles the problem of audio model robustness to different microphone conditions by proposing a one-shot microphone style transfer method, which synthesizes audio as if recorded by a target device and significantly improves model robustness in downstream tasks.

A crucial aspect for the successful deployment of audio-based models "in-the-wild" is robustness to the transformations introduced by heterogeneous acquisition conditions. In this work, we propose a method to perform one-shot microphone style transfer. Given only a few seconds of audio recorded by a target device, MicAugment identifies the transformations associated with the input acquisition pipeline and uses the learned transformations to synthesize audio as if it were recorded under the same conditions as the target audio. We show that our method can successfully apply the style transfer to real audio and that it significantly increases model robustness when used as data augmentation in downstream tasks.
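To make the idea of "synthesizing audio as if recorded by a target device" more concrete, here is a minimal sketch of applying a microphone-like transformation chain as data augmentation. It is not the paper's method: the abstract does not specify which transformations MicAugment learns, so the chain used here (linear filtering, additive noise, hard clipping), the function name `apply_mic_style`, and all parameter values are assumptions chosen purely for illustration. In the setting the abstract describes, such parameters would be estimated from a few seconds of target-device audio rather than set by hand.

```python
import numpy as np


def apply_mic_style(audio, impulse_response, noise_std=0.0, clip_level=1.0, seed=0):
    """Apply a hypothetical microphone-like transformation to a mono signal.

    Assumed chain (illustrative only, not taken from the paper):
    linear filter -> additive Gaussian noise -> hard clipping.
    """
    rng = np.random.default_rng(seed)
    # Linear filtering: convolve with an assumed device impulse response,
    # then trim so the output has the same length as the input.
    filtered = np.convolve(audio, impulse_response, mode="full")[: len(audio)]
    # Additive Gaussian noise standing in for device and background noise.
    noisy = filtered + rng.normal(0.0, noise_std, size=filtered.shape)
    # Hard clipping standing in for saturation in the recording chain.
    return np.clip(noisy, -clip_level, clip_level)


# Example: augment one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)

# Hand-picked parameters; a learned model would estimate these instead.
impulse_response = np.array([0.6, 0.3, 0.1])
augmented = apply_mic_style(clean, impulse_response, noise_std=0.01, clip_level=0.4)
```

Used as augmentation, a function like this would be applied to clean training examples before they reach the downstream model, so the model sees inputs that resemble the target recording conditions.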

Code Implementations: 5 repos