SD · LG · AS · Oct 29, 2018

End-to-end music source separation: is it possible in the waveform domain?

arXiv:1810.12187v2 · 77 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of improving music source separation for audio processing applications by moving beyond spectrogram-based methods. The advance is incremental because it builds on existing deep learning approaches.

The study explores end-to-end music source separation models that operate in the waveform domain and can therefore use all the information in the audio, including the phase. It finds that waveform-based models, namely a Wavenet-based model and Wave-U-Net, can perform similarly to, or better than, the spectrogram-based model DeepConvSep.

Most of the currently successful source separation techniques use the magnitude spectrogram as input, and therefore omit part of the signal by default: the phase. To avoid omitting potentially useful information, we study the viability of using end-to-end models for music source separation, which take into account all the information available in the raw audio signal, including the phase. Although end-to-end music source separation has been considered almost unattainable over the last decades, our results confirm that waveform-based models can perform similarly to (if not better than) a spectrogram-based deep learning model. Namely: a Wavenet-based model we propose and Wave-U-Net can outperform DeepConvSep, a recent spectrogram-based deep learning model.
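
To make concrete what omitting the phase means, here is a minimal NumPy sketch. It is not code from the paper: it uses a single FFT frame as a stand-in for one column of a spectrogram, and the helper names `magnitude_and_phase` and `reconstruct` are hypothetical. It shows that two different waveforms can share the same magnitude spectrum, and that the magnitude alone does not recover the original frame.

```python
import numpy as np


def magnitude_and_phase(frame):
    """Split the spectrum of a real-valued frame into magnitude and phase."""
    spectrum = np.fft.rfft(frame)
    return np.abs(spectrum), np.angle(spectrum)


def reconstruct(magnitude, phase, n):
    """Rebuild a length-n time-domain frame from magnitude and phase."""
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)


rng = np.random.default_rng(0)
n = 1024
frame = rng.standard_normal(n)      # illustrative signal, not real music
shifted = np.roll(frame, 100)       # circularly shifted copy of the frame

mag, phase = magnitude_and_phase(frame)
mag_shifted, _ = magnitude_and_phase(shifted)

# A circular shift changes only the phase, so the magnitudes coincide
# even though the two waveforms differ.
same_magnitude = np.allclose(mag, mag_shifted)
different_waveform = not np.allclose(frame, shifted)

# With the true phase the frame is recovered; with the phase discarded
# (set to zero here) it is not.
exact = np.allclose(reconstruct(mag, phase, n), frame)
zero_phase_error = np.max(
    np.abs(reconstruct(mag, np.zeros_like(phase), n) - frame)
)

print(same_magnitude, different_waveform, exact, zero_phase_error)
```

A model that only sees `mag` cannot distinguish `frame` from `shifted`, whereas a waveform-domain model receives both signals as they are.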
