YM2413-MDB: A Multi-Instrumental FM Video Game Music Dataset with Emotion Annotations
This dataset addresses a gap for researchers in music information retrieval and AI music generation by providing video game music with emotion tags, although the contribution is incremental because it builds on existing dataset efforts.
The authors address the lack of diverse, annotated multi-instrumental music datasets by creating YM2413-MDB, an 80s FM video game music dataset with 669 audio and MIDI files and multi-label emotion annotations. They also provide baseline models for emotion recognition and emotion-conditioned symbolic music generation.
Existing multi-instrumental datasets tend to be biased toward pop and classical music. In addition, they generally lack high-level annotations such as emotion tags. In this paper, we propose YM2413-MDB, an 80s FM video game music dataset with multi-label emotion annotations. It includes 669 audio and MIDI files of music from 80s Sega and MSX PC games that used the YM2413, a programmable sound generator based on FM. The collected game music is arranged with a subset of 15 monophonic instruments and one drum instrument. The files were converted from the binary commands of the YM2413 sound chip. Each song was labeled with 19 emotion tags by two annotators and validated by three verifiers to obtain refined tags. We provide baseline models and results for emotion recognition and emotion-conditioned symbolic music generation using YM2413-MDB.
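To make the multi-label annotation concrete, here is a minimal sketch of how a song's emotion tags could be stored as a multi-hot vector over a 19-tag vocabulary, which is a common target format for multi-label recognition models. The tag names (`tag_00` to `tag_18`) and the helpers `encode_tags` and `decode_tags` are hypothetical placeholders for illustration. They are not the dataset's actual vocabulary or file format.

```python
from typing import Iterable, List

NUM_TAGS = 19  # the abstract states that 19 emotion tags are used

# Placeholder names (assumption): the real tag vocabulary is not listed here.
TAG_VOCAB: List[str] = [f"tag_{i:02d}" for i in range(NUM_TAGS)]
TAG_INDEX = {name: i for i, name in enumerate(TAG_VOCAB)}


def encode_tags(tags: Iterable[str]) -> List[int]:
    """Return a multi-hot vector with a 1 at the index of each given tag."""
    vec = [0] * NUM_TAGS
    for tag in tags:
        vec[TAG_INDEX[tag]] = 1  # raises KeyError for a tag outside the vocabulary
    return vec


def decode_tags(vec: List[int]) -> List[str]:
    """Return the tag names whose positions are set in a multi-hot vector."""
    return [TAG_VOCAB[i] for i, value in enumerate(vec) if value]


# A song can carry several tags at once, which is what "multi-label" means.
example = encode_tags(["tag_03", "tag_11"])
```

A vector of this shape lets a classifier predict each tag independently, so one song can be assigned any combination of emotions rather than a single class.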