KUIELab-MDX-Net: A Two-Stream Neural Network for Music Demixing
This work addresses the high computational cost of deep learning models for music demixing. Its contribution is incremental, since it builds on existing deep learning methods.
The paper tackles music source separation by proposing KUIELab-MDX-Net, a two-stream neural network that balances performance and resource efficiency. The model took second place on leaderboard A and third place on leaderboard B in the Music Demixing Challenge at ISMIR 2021.
Recently, many deep learning methods have been proposed for music source separation. Some state-of-the-art methods have shown that stacking many layers with many skip connections improves signal-to-distortion ratio (SDR) performance. Although such deep and complex architectures show outstanding performance, they usually require substantial computing resources and time for training and evaluation. This paper proposes a two-stream neural network for music demixing, called KUIELab-MDX-Net, which offers a good balance between performance and required resources. The proposed model has a time-frequency branch and a time-domain branch, each of which separates the stems on its own. It then blends the results from the two streams to generate the final estimate. KUIELab-MDX-Net took second place on leaderboard A and third place on leaderboard B in the Music Demixing Challenge at ISMIR 2021. This paper also summarizes experimental results on another benchmark, MUSDB18. Our source code is available online.
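The blending step can be illustrated with a minimal sketch. The abstract does not say how the two streams are combined, so the per-stem weighted average below is an assumption, not the authors' method. The function name `blend_streams`, the `tf_weights` parameter, and the sample values are all hypothetical. Waveforms are plain lists of samples to keep the example self-contained.

```python
from typing import Dict, List

Waveform = List[float]  # mono samples; a real system would likely use arrays


def blend_streams(
    tf_estimates: Dict[str, Waveform],
    time_estimates: Dict[str, Waveform],
    tf_weights: Dict[str, float],
) -> Dict[str, Waveform]:
    """Blend per-stem estimates from two branches by weighted average.

    ASSUMPTION: a weighted average is only one plausible blending rule;
    the abstract does not specify the rule used by KUIELab-MDX-Net.
    tf_weights[stem] is the weight given to the time-frequency branch;
    the time-domain branch receives 1 - tf_weights[stem].
    """
    blended: Dict[str, Waveform] = {}
    for stem, tf_wave in tf_estimates.items():
        time_wave = time_estimates[stem]
        if len(tf_wave) != len(time_wave):
            raise ValueError(f"length mismatch for stem {stem!r}")
        weight = tf_weights[stem]
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for stem {stem!r} must be in [0, 1]")
        # Sample-wise convex combination of the two branch outputs.
        blended[stem] = [
            weight * a + (1.0 - weight) * b
            for a, b in zip(tf_wave, time_wave)
        ]
    return blended


# Hypothetical two-sample estimates for a single stem.
tf_out = {"vocals": [1.0, 0.0]}
time_out = {"vocals": [0.0, 1.0]}
result = blend_streams(tf_out, time_out, {"vocals": 0.5})
```

Using one weight per stem lets each source lean on whichever branch estimates it better. In practice such weights would have to be tuned on validation data.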