SD, AS, ML | May 11, 2020

GACELA -- A generative adversarial context encoder for long audio inpainting

arXiv:2005.05032v1 | 54 citations
AI Analysis

This paper addresses audio inpainting for music signals with long gaps and offers a framework for future improvements, though the contribution is incremental because it builds on existing GAN methods.

The paper tackles the restoration of missing musical audio over long gaps (hundreds of milliseconds to a few seconds) using GACELA, a generative adversarial network. In listening tests with gap durations from 375 ms to 1500 ms, the severity of artifacts decreased from unacceptable to mildly disturbing.

We introduce GACELA, a generative adversarial network (GAN) designed to restore missing musical audio data with a duration ranging from hundreds of milliseconds to a few seconds, i.e., to perform long-gap audio inpainting. While previous work either addressed shorter gaps or relied on exemplars by copying available information from other signal parts, GACELA addresses the inpainting of long gaps in two aspects. First, it considers various time scales of audio information by relying on five parallel discriminators with increasing resolution of receptive fields. Second, it is conditioned not only on the available information surrounding the gap, i.e., the context, but also on the latent variable of the conditional GAN. This addresses the inherent multi-modality of audio inpainting at such long gaps and provides the option of user-defined inpainting. GACELA was tested in listening tests on music signals of varying complexity and gap durations ranging from 375 ms to 1500 ms. While our subjects were often able to detect the inpaintings, the severity of the artifacts decreased from unacceptable to mildly disturbing. GACELA represents a framework capable of integrating future improvements such as processing of more auditory-related features or more explicit musical features.
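To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and contains no neural network. The sample rate of 16 kHz, the doubling of the window per scale, the average pooling, and all function names (`gap_samples`, `split_context`, `multiscale_views`, `toy_generator`) are assumptions made for illustration only. It shows (1) how one gap can be presented at five time scales, one per hypothetical discriminator, and (2) how the same context with a different latent vector `z` yields a different fill.

```python
import random

SAMPLE_RATE = 16000  # assumption for illustration; not stated in the abstract


def gap_samples(gap_ms, sample_rate=SAMPLE_RATE):
    """Convert a gap duration in milliseconds to a number of samples."""
    return int(round(gap_ms * sample_rate / 1000))


def split_context(signal, gap_start, gap_len):
    """Return the (left, right) context surrounding the gap."""
    return signal[:gap_start], signal[gap_start + gap_len:]


def multiscale_views(signal, gap_start, gap_len, num_scales=5):
    """Build one view per (hypothetical) discriminator.

    Scale k covers a window of gap_len * 2**k samples centred on the gap
    and is average-pooled by 2**k, so every view has gap_len values but
    spans a progressively longer stretch of time. Samples outside the
    signal are treated as zeros.
    """
    centre = gap_start + gap_len // 2
    views = []
    for k in range(num_scales):
        factor = 2 ** k
        width = gap_len * factor
        start = centre - width // 2
        window = [signal[i] if 0 <= i < len(signal) else 0.0
                  for i in range(start, start + width)]
        pooled = [sum(window[j:j + factor]) / factor
                  for j in range(0, width, factor)]
        views.append(pooled)
    return views


def toy_generator(left, right, gap_len, z):
    """Stand-in for a conditional generator (assumption, not a GAN).

    The fill depends on the context (a linear bridge between the samples
    bordering the gap) and on the latent vector z (added variation).
    """
    a, b = left[-1], right[0]
    fill = []
    for n in range(gap_len):
        t = (n + 1) / (gap_len + 1)
        fill.append((1 - t) * a + t * b + 0.1 * z[n])
    return fill


if __name__ == "__main__":
    rng = random.Random(0)
    signal = [rng.uniform(-1.0, 1.0) for _ in range(256)]
    gap_start, gap_len = 120, 16
    left, right = split_context(signal, gap_start, gap_len)
    z = [rng.gauss(0.0, 1.0) for _ in range(gap_len)]
    fill = toy_generator(left, right, gap_len, z)
    restored = left + fill + right
    views = multiscale_views(restored, gap_start, gap_len)
    print([len(v) for v in views])
```

Drawing a new `z` while keeping `left` and `right` fixed gives a different candidate fill, which is the sense in which a latent variable can address the multi-modality mentioned in the abstract.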

Code Implementations: 2 repos