Can MusicGen Create Training Data for MIR Tasks?
The work addresses data scarcity for MIR researchers, though its contribution is incremental: it is an initial experiment within a broader concept.
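The paper addresses the problem of generating training data for Music Information Retrieval (MIR) tasks by using the AI-based generative system MusicGen to create artificial music. The authors constructed over 50,000 genre-conditioned textual descriptions, generated music excerpts covering five genres, and trained a genre classifier on this fully artificial dataset. Preliminary results indicate that the classifier learns genre-specific characteristics that generalise well to real-world recordings.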
We are investigating the broader concept of using AI-based generative music systems to generate training data for Music Information Retrieval (MIR) tasks. To kick off this line of work, we ran an initial experiment in which we trained a genre classifier on a fully artificial music dataset created with MusicGen. We constructed over 50 000 genre-conditioned textual descriptions and generated a collection of music excerpts that covers five musical genres. Our preliminary results show that the proposed model can learn genre-specific characteristics from artificial music tracks that generalise well to real-world music recordings.
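To make the idea of genre-conditioned textual descriptions concrete, the following is a minimal sketch of how such prompts might be assembled. It is an illustration under stated assumptions, not the authors' procedure: the genre names, the descriptor vocabularies, the sentence template, and the helper `build_prompts` are all hypothetical, since the abstract does not specify them.

```python
import itertools
import random

# Hypothetical vocabulary: the abstract does not list its five genres
# or the descriptors used, so these values are placeholders.
GENRES = ["rock", "jazz", "classical", "hip hop", "electronic"]
MOODS = ["energetic", "melancholic", "calm", "uplifting"]
TEMPI = ["slow", "mid-tempo", "fast"]
INSTRUMENTS = ["guitar", "piano", "synthesizer", "strings", "drums"]


def build_prompts(per_genre, seed=0):
    """Return (description, genre_label) pairs, balanced across genres."""
    rng = random.Random(seed)
    # Every mood/tempo/instrument combination is a candidate description.
    combos = list(itertools.product(MOODS, TEMPI, INSTRUMENTS))
    prompts = []
    for genre in GENRES:
        # Sample without replacement so descriptions are unique per genre.
        chosen = rng.sample(combos, k=min(per_genre, len(combos)))
        for mood, tempo, instrument in chosen:
            text = f"A {tempo}, {mood} {genre} track featuring {instrument}"
            prompts.append((text, genre))
    return prompts


prompts = build_prompts(per_genre=10)
print(len(prompts), prompts[0])
```

Each description would then be passed to the text-to-music model to synthesise one excerpt, with the genre that conditioned the prompt serving as the training label for the classifier.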