Emotion-Conditioned Melody Harmonization with Hierarchical Variational Autoencoder
This work extends melody harmonization for music generation by adding emotional conditioning and increasing the variability of the generated harmonies.
The proposed model conditions harmonization on emotion, outperforms other LSTM-based models in objective experiments, and generates variable harmonies.
Existing melody harmonization models have made great progress in improving the quality of generated harmonies, but most of them ignore the emotion conveyed by the music. In addition, the harmonies generated by previous methods lack variability. To address both problems, we propose a novel LSTM-based Hierarchical Variational Autoencoder (LHVAE) that investigates the influence of emotional conditions on melody harmonization while improving the quality of the generated harmonies and capturing the rich variability of chord progressions. Specifically, LHVAE incorporates latent variables and emotional conditions at two levels (piece level and bar level) to model the global and local properties of the music. We also introduce an attention-based melody context vector at each step to better learn the correspondence between melodies and harmonies. Objective experimental results show that the proposed model outperforms other LSTM-based models. From the subjective evaluation, we conclude that altering only the chord types hardly changes the overall emotion of the music. The qualitative analysis demonstrates that the model can generate variable harmonies.
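To make the attention-based melody context vector concrete, the following is a minimal sketch of how such a vector could be computed at one decoding step. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the scoring function, so scaled dot-product scoring is assumed here, and the names `melody_context`, `decoder_state`, and `melody_states` are hypothetical. The sketch also assumes the decoder state and the melody encoder states share the same dimensionality.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def melody_context(decoder_state, melody_states):
    """Return (context, weights) for a single decoding step.

    decoder_state: list[float], hypothetical hidden state of the chord decoder.
    melody_states: list[list[float]], hypothetical melody encoder outputs,
                   one vector per melody time step.
    Scaled dot-product scoring is an assumption; the source does not
    state which scoring function the model uses.
    """
    dim = len(decoder_state)
    # Score each melody step against the current decoder state.
    scores = [
        sum(d * m for d, m in zip(decoder_state, state)) / math.sqrt(dim)
        for state in melody_states
    ]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of melody states.
    context = [
        sum(w * state[i] for w, state in zip(weights, melody_states))
        for i in range(dim)
    ]
    return context, weights
```

In a harmonization decoder of this kind, the returned context vector would typically be concatenated with the decoder input at each step, so that the chord prediction can attend to the most relevant melody positions.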