Learning a Latent Space of Multitrack Measures
This work addresses intuitive control and generation of richly instrumented music, with applications in music production and AI creativity. The contribution is incremental in that it extends an existing model.
The authors address the representation and generation of multi-instrumental music by extending the MusicVAE model to learn a latent space of multitrack polyphonic measures. This latent space supports generation, interpolation, and chord-conditioned manipulation, and generating measures over a chord progression produces music with convincing long-term structure.
Discovering and exploring the underlying structure of multi-instrumental music using learning-based approaches remains an open problem. We extend the recent MusicVAE model to represent multitrack polyphonic measures as vectors in a latent space. Our approach enables several useful operations such as generating plausible measures from scratch, interpolating between measures in a musically meaningful way, and manipulating specific musical attributes. We also introduce chord conditioning, which allows all of these operations to be performed while keeping harmony fixed, and allows chords to be changed while maintaining musical "style". By generating a sequence of measures over a predefined chord progression, our model can produce music with convincing long-term structure. We demonstrate that our latent space model makes it possible to intuitively control and generate musical sequences with rich instrumentation (see https://goo.gl/s2N7dV for generated audio).
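As a rough illustration of what interpolating between measures in a latent space can look like, the sketch below interpolates between two latent vectors along a great-circle path (spherical interpolation), a common choice for VAEs with a Gaussian prior. The abstract does not specify the interpolation scheme, so this choice is an assumption, as are the function names `slerp` and `interpolate`. The encoder that maps a measure to a vector and the decoder that maps each interpolated vector back to a measure are not shown.

```python
import numpy as np


def slerp(z0, z1, t):
    """Spherically interpolate between latent vectors z0 and z1 at fraction t in [0, 1]."""
    z0 = np.asarray(z0, dtype=float)
    z1 = np.asarray(z1, dtype=float)
    # Angle between the two vectors, from their normalized dot product.
    cos_omega = np.dot(z0, z1) / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    # Fall back to linear interpolation when the vectors are (nearly) collinear.
    if sin_omega < 1e-8:
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / sin_omega


def interpolate(z0, z1, steps):
    """Return `steps` latent vectors evenly spaced in t, from z0 to z1 inclusive."""
    return [slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, steps)]
```

In use, `z0` and `z1` would be the latent codes of two encoded measures, and each vector returned by `interpolate` would be passed through the decoder to obtain one intermediate measure.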