GLSR-VAE: Geodesic Latent Space Regularization for Variational AutoEncoder Architectures
This work addresses the need for fine control over latent space embeddings in VAEs for creative applications such as music generation, and it represents an incremental improvement.
The paper tackles the problem of controlling continuous attributes in data generation with VAEs. It introduces GLSR-VAE, a geodesic latent space regularization method, and demonstrates its efficiency on a monophonic music generation task, where it enables continuous modulation of the generated sequences.
VAEs (Variational AutoEncoders) have proved to be powerful in the context of density modeling and have been used in a variety of contexts for creative purposes. In many settings, the data we model possesses continuous attributes that we would like to take into account at generation time. In this paper we propose GLSR-VAE, a Geodesic Latent Space Regularization for the Variational AutoEncoder architecture and its generalizations, which allows fine control over the embedding of the data into the latent space. When the VAE loss is augmented with this regularization, changes in the learned latent space reflect changes in the attributes of the data. This deeper understanding of the VAE latent space structure offers the possibility to modulate the attributes of the generated data in a continuous way. We demonstrate its efficiency on a monophonic music generation task where we manage to generate variations of discrete sequences in an intended and playful way.
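The idea of augmenting the VAE loss so that moving in the latent space changes an attribute in a predictable way can be pictured with a small sketch. This is not the paper's formulation: the function names, the finite-difference slope estimate, the Gaussian-style penalty, and the `target`, `scale`, and `weight` values are all hypothetical choices made for illustration. The sketch assumes an attribute that can be evaluated directly as a function of a latent vector.

```python
def attribute_slope(attribute_of_latent, z, dim, eps=1e-4):
    """Central finite-difference estimate of d(attribute)/d(z[dim]) at z."""
    z_plus, z_minus = list(z), list(z)
    z_plus[dim] += eps
    z_minus[dim] -= eps
    return (attribute_of_latent(z_plus) - attribute_of_latent(z_minus)) / (2 * eps)


def slope_penalty(slope, target=1.0, scale=0.1):
    """Quadratic penalty (a Gaussian negative log-density up to a constant)
    that is small when the slope is close to the assumed target value."""
    return 0.5 * ((slope - target) / scale) ** 2


def regularized_loss(vae_loss, attribute_of_latent, z, dim=0, weight=1.0):
    """Add the slope penalty along one latent dimension to a given VAE loss value."""
    slope = attribute_slope(attribute_of_latent, z, dim)
    return vae_loss + weight * slope_penalty(slope)


def toy_attribute(z):
    """Hypothetical attribute: rises at rate 1.0 along z[0] and 0.5 along z[1]."""
    return 1.0 * z[0] + 0.5 * z[1]


z = [0.3, -0.2]
loss_dim0 = regularized_loss(2.0, toy_attribute, z, dim=0)  # slope matches the target
loss_dim1 = regularized_loss(2.0, toy_attribute, z, dim=1)  # slope is too small
```

In this toy setting the penalty nearly vanishes along the first dimension, where the attribute already grows at the target rate, and is large along the second. In an actual model the slope would more plausibly come from automatic differentiation through the decoder than from finite differences.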