Symbolic music generation conditioned on continuous-valued emotions
This work addresses emotion-driven music generation, with applications in creative AI and entertainment, and represents an incremental improvement over existing conditioning techniques.
The paper tackles the generation of multi-instrument symbolic music conditioned on continuous-valued emotions (valence and arousal). The results show that the proposed approach outperforms conditioning with control tokens, which is representative of the current state of the art, in both note prediction accuracy and a regression task in the valence-arousal plane.
In this paper we present a new approach for the generation of multi-instrument symbolic music driven by musical emotion. The principal novelty of our approach centres on conditioning a state-of-the-art transformer on continuous-valued valence and arousal labels. In addition, we provide a new large-scale dataset of symbolic music paired with emotion labels in terms of valence and arousal. We evaluate our approach quantitatively in two ways: first by measuring its note prediction accuracy, and second via a regression task in the valence-arousal plane. Our results demonstrate that our proposed approaches outperform conditioning using control tokens, which is representative of the current state of the art.
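To make the contrast between the two conditioning styles concrete, the following Python sketch compares discrete control tokens with a continuous condition vector. It is only an illustration under stated assumptions, not the model described here: the vocabulary size, embedding width, number of bins, the [-1, 1] label range, the random weights, and all function names are hypothetical, and prepending the projected vector is just one plausible way to inject a continuous condition.

```python
import random

random.seed(0)

# All sizes below are hypothetical and chosen only for illustration.
VOCAB_SIZE = 16  # number of ordinary music tokens
D_MODEL = 8      # embedding width
N_BINS = 5       # bins per emotion dimension for the control-token variant


def random_matrix(rows, cols):
    """Stand-in for learned weights: a rows x cols matrix of Gaussian samples."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Embedding table; the last 2 * N_BINS rows are reserved for control tokens.
token_embedding = random_matrix(VOCAB_SIZE + 2 * N_BINS, D_MODEL)
# Linear map from (valence, arousal) to the embedding space.
emotion_proj = random_matrix(2, D_MODEL)


def to_bin(value):
    """Quantise a value assumed to lie in [-1, 1] into a bin index 0..N_BINS-1."""
    clipped = min(max(value, -1.0), 1.0)
    return min(int((clipped + 1.0) / 2.0 * N_BINS), N_BINS - 1)


def control_token_ids(valence, arousal):
    """Discrete conditioning: one control token per emotion dimension."""
    return [VOCAB_SIZE + to_bin(valence), VOCAB_SIZE + N_BINS + to_bin(arousal)]


def embed_with_control_tokens(note_ids, valence, arousal):
    """Prepend the two control tokens, then look up every embedding."""
    ids = control_token_ids(valence, arousal) + list(note_ids)
    return [token_embedding[i] for i in ids]


def embed_with_continuous_condition(note_ids, valence, arousal):
    """Continuous conditioning: project (valence, arousal) and prepend the vector."""
    cond = [
        valence * emotion_proj[0][j] + arousal * emotion_proj[1][j]
        for j in range(D_MODEL)
    ]
    return [cond] + [token_embedding[i] for i in note_ids]
```

In this sketch, two nearby emotion values such as 0.30 and 0.35 fall into the same bin and therefore yield identical control tokens, whereas the continuous variant produces a distinct condition vector for each value. That loss of resolution is the kind of limitation continuous-valued conditioning is meant to avoid.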