AS LG SD · Sep 17, 2024

SynthSOD: Developing an Heterogeneous Dataset for Orchestra Music Source Separation

Jaime Garcia-Martinez, David Diaz-Guerra, Archontis Politis, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

arXiv:2409.10995v24.33 citations · h-index: 25 · Has Code

Originality: Synthesis-oriented

AI Analysis

This paper addresses the scarcity of clean multitrack datasets for orchestra music separation, a domain-specific problem for audio processing researchers. The approach is incremental: it builds on existing methods and contributes new data.

The paper tackles the problem of source separation for orchestra music by introducing SynthSOD, a novel multitrack dataset created using simulation techniques, and demonstrates its use by training a baseline model that shows competitive performance on synthetic and real-world evaluations.

Music source separation has progressed significantly in recent years, particularly in isolating vocals, drums, and bass from mixed tracks. These developments owe much to the creation and use of large-scale multitrack datasets dedicated to these specific components. However, the challenge of extracting similar-sounding sources from orchestra recordings has not been extensively explored, largely due to a scarcity of comprehensive and clean (i.e., bleed-free) multitrack datasets. In this paper, we introduce a novel multitrack dataset called SynthSOD, developed using a set of simulation techniques to create a realistic (i.e., using high-quality soundfonts), musically motivated, and heterogeneous training set comprising different dynamics, natural tempo changes, styles, and conditions. Moreover, we demonstrate the application of a widely used baseline music separation model trained on our synthesized dataset with respect to the well-known EnsembleSet, and evaluate its performance under both synthetic and real-world conditions.
