SD · LG · NE · ML · Oct 30, 2017

Generative Adversarial Source Separation

arXiv:1710.10779v1 · 76 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving source separation accuracy for audio processing applications, but the advance is incremental: it adapts existing GAN methods to a specific task.

The paper tackles speech source separation by proposing a multi-layer perceptron trained with a Wasserstein-GAN formulation, which outperforms NMF, auto-encoders, and variational auto-encoders in terms of source to distortion ratio.

Generative source separation methods such as non-negative matrix factorization (NMF) or auto-encoders rely on the assumption of an output probability density. Generative Adversarial Networks (GANs) can learn data distributions without needing a parametric assumption on the output density. We show on a speech source separation experiment that a multi-layer perceptron trained with a Wasserstein-GAN formulation outperforms NMF, auto-encoders trained with maximum likelihood, and variational auto-encoders in terms of source to distortion ratio.
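
For readers unfamiliar with the metric used in the comparison above, here is a minimal sketch of a source to distortion ratio in its simplest form: the energy of the reference source divided by the energy of the estimation error, in decibels. This is an illustration only, not the paper's evaluation code. Evaluation toolkits such as BSS Eval define SDR with additional projection steps that are omitted here. The function name `sdr_db`, the `eps` guard, and the toy signals are all assumptions made for the example.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified source to distortion ratio in dB (hypothetical helper).

    Computes 10 * log10(||s||^2 / ||s - s_hat||^2). The eps term is an
    assumption added to avoid division by zero for a perfect estimate.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))


# Toy signals (not data from the paper): a sine "source" and two noisy estimates.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
source = np.sin(2.0 * np.pi * 440.0 * t)
estimate_good = source + 0.1 * rng.standard_normal(t.shape)
estimate_bad = source + 0.5 * rng.standard_normal(t.shape)

sdr_good = sdr_db(source, estimate_good)
sdr_bad = sdr_db(source, estimate_bad)
print(f"SDR (light noise): {sdr_good:.2f} dB")
print(f"SDR (heavy noise): {sdr_bad:.2f} dB")
```

A higher value means the estimate is closer to the reference source, so in this toy setup the lightly corrupted estimate scores above the heavily corrupted one.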

Code Implementations: 1 repo
