OpenMU: Your Swiss Army Knife for Music Understanding
This work addresses the scarcity of training data for music understanding models, which benefits both researchers and creative music production. The contribution is incremental, however, because it builds on existing datasets and annotations.
The authors tackle data scarcity in training multimodal language models for music understanding by creating OpenMU-Bench, a large-scale benchmark suite, and use it to train OpenMU, which outperforms baseline models such as MU-Llama.
We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music. To construct OpenMU-Bench, we leveraged existing datasets and bootstrapped new annotations. OpenMU-Bench also broadens the scope of music understanding by including lyrics understanding and music tool usage. Using OpenMU-Bench, we trained our music understanding model, OpenMU, and ran extensive ablations; the results demonstrate that OpenMU outperforms baseline models such as MU-Llama. Both OpenMU and OpenMU-Bench are open-sourced to facilitate future research in music understanding and to make creative music production more efficient.