Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data
This work addresses the challenge of noisy training data for researchers and practitioners in music source separation, though its contribution is incremental because it builds on existing label refinement methods.
The paper tackles the problem of mislabeled instrument tracks in music source separation datasets by introducing an automated self-refining technique for noisy labels. A classifier trained this way loses only 1% accuracy in multi-label instrument recognition compared to one trained on clean labels, and separation models trained on the self-refined data even outperform those trained on data refined with a clean-label classifier.
Music source separation (MSS) faces challenges due to the limited availability of correctly labeled individual instrument tracks. With the push to acquire larger datasets to improve MSS performance, mislabeled individual instrument tracks become inevitable and pose a significant challenge. This paper introduces an automated technique for refining the labels in a partially mislabeled dataset. Our proposed self-refining technique, applied to a noisy-labeled dataset, results in only a 1% accuracy degradation in multi-label instrument recognition compared to a classifier trained on a clean-labeled dataset. The study demonstrates the importance of refining noisy-labeled data in MSS model training and shows that training on the refined dataset yields results comparable to those obtained with a clean-labeled dataset. Notably, when only a noisy dataset is available, MSS models trained on a self-refined dataset even outperform those trained on a dataset refined with a classifier trained on clean labels.
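
The abstract does not spell out the refinement procedure, so the following is only a generic illustration of the self-refining idea (train on noisy labels, relabel the data with the model's own predictions, repeat), not the authors' method. It assumes a toy single-label setting with synthetic 2-D features and a nearest-centroid classifier; the names `fit_centroids`, `predict`, and `self_refine` are hypothetical, and the paper itself works with multi-label instrument recognition on audio.

```python
import numpy as np


def fit_centroids(features, labels, num_classes):
    # One centroid per class, computed from the (possibly noisy) labels.
    # Assumption: every class keeps at least one sample in each round.
    return np.stack(
        [features[labels == c].mean(axis=0) for c in range(num_classes)]
    )


def predict(features, centroids):
    # Assign each sample to the nearest centroid (Euclidean distance).
    dists = np.linalg.norm(
        features[:, None, :] - centroids[None, :, :], axis=2
    )
    return dists.argmin(axis=1)


def self_refine(features, noisy_labels, num_classes, rounds=3):
    # Repeatedly retrain on the current labels and replace them with
    # the model's own predictions (pseudo labels).
    labels = noisy_labels.copy()
    for _ in range(rounds):
        centroids = fit_centroids(features, labels, num_classes)
        labels = predict(features, centroids)
    return labels


# Synthetic stand-in for "instrument embeddings": two well-separated classes.
rng = np.random.default_rng(0)
true_labels = np.repeat([0, 1], 200)
class_means = np.array([[-4.0, -4.0], [4.0, 4.0]])
features = class_means[true_labels] + rng.normal(size=(400, 2))

# Corrupt exactly 20% of the labels to simulate a partially mislabeled dataset.
flip_idx = rng.choice(400, size=80, replace=False)
noisy_labels = true_labels.copy()
noisy_labels[flip_idx] = 1 - noisy_labels[flip_idx]

refined_labels = self_refine(features, noisy_labels, num_classes=2)

noisy_acc = (noisy_labels == true_labels).mean()
refined_acc = (refined_labels == true_labels).mean()
print(f"label accuracy before refinement: {noisy_acc:.3f}")
print(f"label accuracy after refinement:  {refined_acc:.3f}")
```

In this toy setting the classes are far apart, so even centroids estimated from corrupted labels land on the correct side and the relabeling recovers most flipped labels. Real instrument recognition is much harder, and a practical variant would likely only overwrite labels on which the classifier is confident.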