A Survey of Music Generation in the Context of Interaction
This survey addresses the problem of enabling interactive music co-creation for musicians and researchers; its contribution is incremental, as it primarily reviews existing methods.
The paper surveys music generation techniques, identifying that current machine learning models excel at style replication and transfer but are unsuitable for live human-machine co-creation, and it reviews approaches to address this gap.
In recent years, machine learning, and in particular generative adversarial neural networks (GANs) and attention-based neural networks (transformers), has been successfully used to compose and generate music, both melodies and polyphonic pieces. Current research focuses foremost on style replication (e.g., generating a Bach-style chorale) or style transfer (e.g., classical to jazz) based on large amounts of recorded or transcribed music, which in turn also allows for fairly straightforward "performance" evaluation. However, most of these models are not suitable for human-machine co-creation through live interaction, nor is it clear how such models and the resulting creations would be evaluated. This article presents a thorough review of music representation, feature analysis, heuristic algorithms, statistical and parametric modelling, and human and automatic evaluation measures, along with a discussion of which approaches and models seem most suitable for live interaction.