MM LG SD AS Nov 7, 2023

Are Words Enough? On the semantic conditioning of affective music generation

Jorge Forero, Gilberto Bernardes, Mónica Mendes

arXiv:2311.03624

Originality: Synthesis-oriented

AI Analysis

This is an incremental scoping review that addresses the challenge of verbalizing emotions for music generation, relevant to researchers and creative professionals.

The paper reviews the potential of generating music conditioned by emotions using natural language processing, concluding that deep learning models can help overcome language limitations to impact creative industries.

Music has long been recognized as a means of expressing emotions. An intense debate has emerged, however, over the need to verbalize musical emotions. This concern is highly relevant today, given the exponential growth of natural language processing with deep learning models, which make it possible to generate music automatically from semantic prompts. This scoping review analyzes and discusses the possibilities of music generation conditioned by emotions. To address this topic, we propose a historical perspective that encompasses the different disciplines and methods contributing to it. In detail, we review the two main paradigms adopted in automatic music generation: rule-based and machine-learning models. Of note are the deep learning architectures that aim to generate high-fidelity music from textual descriptions. These models raise fundamental questions about the expressivity of music, including whether emotions can be represented with words or expressed through them. We conclude that, if the limitations and ambiguity of language in expressing emotions through music can be overcome, some uses of deep learning with natural language have the potential to impact the creative industries by providing powerful tools to prompt and generate new musical works.
