SYMPLEX: Controllable Symbolic Music Generation using Simplex Diffusion with Vocabulary Priors
This work addresses the problem of flexible music generation for creative applications, though it appears incremental because it adapts an existing NLP method to a new domain.
The paper tackles controllable symbolic music generation by applying simplex diffusion to 4-bar multi-instrument music loops, achieving control over aspects like infilling and instrumentation without task-specific model adaptation.
We present a new approach for fast and controllable generation of symbolic music based on simplex diffusion, which is essentially a diffusion process operating on probabilities rather than the signal space. This objective has been applied in domains such as natural language processing, but here we apply it to generating 4-bar multi-instrument music loops using an orderless representation. We show that our model can be steered with vocabulary priors, which affords a considerable level of control over the music generation process, for instance, infilling in time and pitch and choice of instrumentation -- all without task-specific model adaptation or applying extrinsic control.
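To make the idea of a vocabulary prior concrete, the following is a minimal sketch and not the authors' implementation. It assumes a prior is a vector of non-negative weights over the token vocabulary at one position, and that steering amounts to reweighting a point on the probability simplex and renormalising. The toy vocabulary, the logits, and the helper names `softmax` and `apply_vocabulary_prior` are all hypothetical; how the prior actually enters the model is not specified in the abstract.

```python
import math


def softmax(logits):
    """Map real-valued logits to a point on the probability simplex."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_vocabulary_prior(probs, prior):
    """Reweight probabilities by a prior and renormalise onto the simplex.

    Assumption: `prior` holds one non-negative weight per vocabulary entry;
    a weight of 0 forbids that token, a weight of 1 leaves it unconstrained.
    """
    weighted = [p * w for p, w in zip(probs, prior)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("prior excludes every token with nonzero probability")
    return [w / total for w in weighted]


# Hypothetical toy vocabulary of pitch tokens for a single note position.
VOCAB = ["C4", "D4", "E4", "F4", "G4"]

# Made-up model logits for that position.
logits = [0.5, 2.0, 0.1, 1.0, 0.3]
probs = softmax(logits)

# Binary prior that only allows the notes of a C major triad.
prior = [1.0, 0.0, 1.0, 0.0, 1.0]
steered = apply_vocabulary_prior(probs, prior)

# The most likely token under the prior.
best = VOCAB[steered.index(max(steered))]
```

Under this reading, infilling in time or pitch, or restricting instrumentation, would correspond to choosing a different prior per position, with an all-ones prior leaving a position unconstrained.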