Adversarial Unsupervised Domain Adaptation for Harmonic-Percussive Source Separation
This work addresses a challenge relevant to audio engineers and researchers: adapting music source separation models to new musical domains, particularly when labelled data is scarce.
This paper tackles the problem of music source separation across different domains, specifically for harmonic-percussive separation. The authors propose an adversarial unsupervised domain adaptation approach that improves separation performance on a new domain using only unlabelled mixtures, without significant performance loss on the original domain.
This paper addresses the problem of domain adaptation for the task of music source separation. Using datasets from two different domains, we compare the performance of a deep learning-based harmonic-percussive source separation model under different training scenarios, including supervised joint training using data from both domains and pre-training in one domain with fine-tuning in another. We propose an adversarial unsupervised domain adaptation approach suitable for the case where no labelled data (ground-truth source signals) from a target domain is available. Experiments show that, by leveraging unlabelled data (only mixtures) from this domain, our framework can improve separation performance on the new domain without a considerable loss of performance on the original domain. We also introduce the Tap & Fiddle dataset, which contains recordings of Scandinavian fiddle tunes along with isolated tracks for 'foot-tapping' and 'violin'.
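The abstract does not spell out the adversarial objective, so the following is only a minimal sketch of how such a setup is commonly formulated, not the authors' implementation. It assumes a generic GAN-style arrangement: a discriminator scores separated outputs as coming from the source domain (label 1) or the target domain (label 0), and the separator is trained with a supervised loss on labelled source data plus an adversarial term computed on unlabelled target mixtures. The function names, the label convention, and the weight `adv_weight` are all hypothetical.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def mean(values):
    return sum(values) / len(values)


def discriminator_loss(d_source, d_target):
    """Loss for the (assumed) domain discriminator.

    d_source / d_target are the discriminator's probabilities that
    separated outputs come from the source domain, evaluated on
    source-domain and target-domain estimates respectively.
    """
    return mean([bce(p, 1) for p in d_source]) + mean([bce(p, 0) for p in d_target])


def separator_loss(sep_loss_source, d_target, adv_weight=0.1):
    """Loss for the separator under the assumed formulation.

    sep_loss_source: supervised separation loss on labelled source data.
    d_target: discriminator probabilities on target-domain estimates.
    The adversarial term is small when the discriminator mistakes
    target-domain estimates for source-domain ones.
    """
    adversarial = mean([bce(p, 1) for p in d_target])
    return sep_loss_source + adv_weight * adversarial


if __name__ == "__main__":
    # Hypothetical discriminator scores for one batch.
    print(discriminator_loss([0.8, 0.9], [0.2, 0.3]))
    print(separator_loss(1.0, [0.2, 0.3]))
```

In this formulation only the adversarial term touches target-domain data, and it needs no ground-truth sources, which is what makes the adaptation unsupervised. The supervised term on the source domain is what would guard against losing performance on the original domain.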