Motifs, Phrases, and Beyond: The Modelling of Structure in Symbolic Music Generation
This review addresses the problem of generating coherent, structured music with AI systems. Its contribution is incremental: it synthesizes existing research without presenting new experimental results.
This literature review examines the evolution of techniques for modelling musical structure in AI-generated symbolic music, highlighting progress in capturing motifs and repetitions but noting ongoing challenges in replicating human-like thematic development across extended compositions.
Modelling musical structure is vital yet challenging for artificial intelligence systems that generate symbolic music compositions. This literature review dissects the evolution of techniques for incorporating coherent structure, from symbolic approaches to foundational and transformative deep learning methods that harness the power of computation and data across a wide variety of training paradigms. In the later stages, we review an emerging technique, which we refer to as "sub-task decomposition", that splits music generation into separate high-level structural planning and content creation stages. Such systems incorporate some form of musical knowledge or neuro-symbolic methods by extracting melodic skeletons or structural templates to guide the generation. Progress is evident in capturing motifs and repetitions across all three eras reviewed, yet modelling the nuanced development of themes across extended compositions in the style of human composers remains difficult. We outline several key future directions to realize the synergistic benefits of combining approaches from all eras examined.