Modeling Musical Context with Word2vec
This work addresses the challenge of modeling musical context for music analysis and generation. The contribution is incremental, however, because it applies an existing NLP method (word2vec) to a new domain.
The authors capture complex polyphonic musical context by training a word2vec model on slices of Beethoven's piano sonatas. The resulting vector space captures tonal relationships without any explicit musical information, and it allows music to be altered by substituting slices that lie a short tonal distance from the originals.
We present a semantic vector space model for capturing complex polyphonic musical context. A word2vec model based on a skip-gram representation with negative sampling was used to model slices of music from a dataset of Beethoven's piano sonatas. A visualization of the reduced vector space using t-distributed stochastic neighbor embedding (t-SNE) shows that the resulting embedded vector space captures tonal relationships, even without any explicit information about the musical contents of the slices. In addition, an excerpt of Beethoven's Moonlight Sonata was altered by replacing slices based on context similarity. The resulting music shows that slices selected for their similar word2vec context also have a relatively short tonal distance from the original slices.
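The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding of a slice as a string of sorted MIDI pitches, the toy chord sequence, and all function names are assumptions made for illustration. In place of a trained skip-gram model with negative sampling, it uses raw context co-occurrence counts as a simple stand-in for the embedding vectors, which is enough to show how slices that occur in similar contexts end up close together.

```python
import math
from collections import Counter, defaultdict


def slice_token(pitches):
    """Encode a slice (the pitches sounding together) as a word-like token.

    Assumption: pitches are MIDI note numbers; the paper's exact encoding
    is not given in this summary.
    """
    return "-".join(str(p) for p in sorted(set(pitches)))


def skipgram_pairs(tokens, window=1):
    """Return (target, context) pairs for every token within the window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def context_vectors(pairs):
    """Count how often each context occurs around each target.

    Stand-in for trained word2vec embeddings, not the skip-gram model itself.
    """
    vectors = defaultdict(Counter)
    for target, context in pairs:
        vectors[target][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_slice(token, vectors):
    """Pick the other slice whose context vector is most similar."""
    candidates = [t for t in vectors if t != token]
    return max(candidates, key=lambda t: cosine(vectors[token], vectors[t]))


# Toy sequence (hypothetical data): C major, D minor, C major, G major, C major.
slices = [[60, 64, 67], [62, 65, 69], [60, 64, 67], [59, 62, 67], [60, 64, 67]]
tokens = [slice_token(s) for s in slices]
pairs = skipgram_pairs(tokens, window=1)
vectors = context_vectors(pairs)

# D minor and G major both occur only next to C major, so they share a context.
replacement = nearest_slice("62-65-69", vectors)
```

In this toy sequence the D minor and G major slices have identical contexts, so one is proposed as the replacement for the other. A trained word2vec model would make the same kind of choice using dense learned vectors rather than raw counts.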