Music Source Restoration
This work addresses the gap between idealized source separation and practical music production, and is aimed at audio researchers and engineers. Its contribution is incremental: it builds on existing separation methods by adding degradation modeling.
The paper tackles the restoration of original music sources from degraded mixtures in real-world music production. It introduces the Music Source Restoration (MSR) task and the RawStems dataset, which contains 578 songs and 354.13 hours of unprocessed source signals, and demonstrates feasibility with a baseline method.
We introduce Music Source Restoration (MSR), a novel task addressing the gap between idealized source separation and real-world music production. Current Music Source Separation (MSS) approaches assume that mixtures are simple sums of sources, ignoring the signal degradations applied during music production, such as equalization, compression, and reverb. MSR instead models mixtures as degraded sums of individually degraded sources, with the goal of recovering the original, undegraded signals. Because no suitable data exists for MSR, we present RawStems, a dataset annotation of 578 songs with unprocessed source signals organized into 8 primary and 17 secondary instrument groups, totaling 354.13 hours. To the best of our knowledge, RawStems is the first dataset that contains unprocessed music stems with hierarchical categories. We consider spectral filtering, dynamic range compression, harmonic distortion, reverb, and lossy codecs as possible degradations, and establish U-Former as a baseline method, demonstrating the feasibility of MSR on our dataset. We publicly release the RawStems dataset annotations, degradation simulation pipeline, training code, and pre-trained models.
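The contrast between the MSS and MSR mixture models can be illustrated with a minimal sketch. It is not the paper's degradation simulation pipeline: the function names (`harmonic_distortion`, `compress`, `mss_mixture`, `msr_mixture`), the parameter values, and the sine-wave "stems" are all assumptions chosen for illustration, and only two of the listed degradations are approximated, each in a simplified memoryless form.

```python
import numpy as np

def harmonic_distortion(x, drive=4.0):
    """Soft clipping with tanh, a simple stand-in for harmonic distortion (assumed form)."""
    return np.tanh(drive * x) / np.tanh(drive)

def compress(x, threshold=0.5, ratio=4.0):
    """Static, memoryless dynamic range compression above the threshold (assumed form)."""
    mag = np.abs(x)
    over = np.maximum(mag - threshold, 0.0)          # amount above the threshold
    out_mag = np.minimum(mag, threshold) + over / ratio
    return np.sign(x) * out_mag

def mss_mixture(sources):
    """MSS assumption: the mixture is the plain sum of the sources."""
    return np.sum(sources, axis=0)

def msr_mixture(sources, source_degradations, mix_degradation):
    """MSR assumption: each source is degraded, summed, then the sum is degraded again."""
    degraded = [d(s) for d, s in zip(source_degradations, sources)]
    return mix_degradation(np.sum(degraded, axis=0))

# Two synthetic "stems" (hypothetical stand-ins for unprocessed sources).
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
sources = np.stack([
    0.8 * np.sin(2 * np.pi * 220 * t),
    0.6 * np.sin(2 * np.pi * 330 * t),
])

clean_mix = mss_mixture(sources)
degraded_mix = msr_mixture(
    sources,
    source_degradations=[harmonic_distortion, compress],
    mix_degradation=compress,
)
```

Under the MSS assumption, subtracting all but one source from `clean_mix` recovers the remaining source exactly. Under the MSR model this no longer holds, because the target is the undegraded `sources[i]` while the mixture contains only its degraded form, further altered by the mixture-level degradation.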