Melody Classifier with Stacked-LSTM
This work addresses the problem of evaluating machine-generated music for researchers and developers in the music AI domain, providing a tool for distinguishing human from AI compositions.
This paper proposes a stacked-LSTM binary classifier to distinguish human-composed melodies from machine-generated ones. It learns from MIDI file features including pitch, position, and duration to perform this classification task.
Generative models for music have become common in recent years, and some have achieved good results: pieces generated by some of these models are almost indistinguishable from those written by human composers. However, research on evaluating machine-generated music is still at a relatively early stage, and there is no uniform standard for the task. This paper proposes a stacked-LSTM binary classifier based on a language model, which distinguishes human-composed melodies from machine-generated ones by learning pitch, position, and duration from MIDI files.
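To make the input representation concrete, the following is a minimal sketch of how a melody might be turned into per-note (pitch, position, duration) triples before being fed to a sequence classifier. It is not the paper's preprocessing: the `Note` type, the `encode_melody` function, the 16th-note grid, the 4/4 bar length, and the reading of "position" as the onset step within the bar are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Note(NamedTuple):
    """Hypothetical note record, as might be read from a MIDI track."""
    pitch: int     # MIDI pitch number (0-127)
    start: int     # onset time in MIDI ticks
    duration: int  # note length in MIDI ticks


def encode_melody(
    notes: List[Note],
    ticks_per_beat: int = 480,  # assumed MIDI resolution
    steps_per_beat: int = 4,    # assumed 16th-note grid
    beats_per_bar: int = 4,     # assumed 4/4 time
) -> List[Tuple[int, int, int]]:
    """Return one (pitch, position, duration) triple per note.

    position: onset step within the bar (0 .. steps_per_bar - 1)
    duration: note length in grid steps, at least 1
    """
    ticks_per_step = ticks_per_beat // steps_per_beat
    steps_per_bar = steps_per_beat * beats_per_bar
    features = []
    # Process notes in onset order so the sequence follows the melody.
    for note in sorted(notes, key=lambda n: n.start):
        position = (note.start // ticks_per_step) % steps_per_bar
        duration = max(1, round(note.duration / ticks_per_step))
        features.append((note.pitch, position, duration))
    return features


# Four notes: C4, D4, E4 in the first bar, then F4 at the start of the second bar.
melody = [
    Note(pitch=60, start=0, duration=480),
    Note(pitch=62, start=480, duration=240),
    Note(pitch=64, start=720, duration=240),
    Note(pitch=65, start=1920, duration=960),
]
print(encode_melody(melody))
# → [(60, 0, 4), (62, 4, 2), (64, 6, 2), (65, 0, 8)]
```

A sequence of such triples is the kind of input a stacked-LSTM classifier could consume, one triple per time step, with a single human-versus-machine label per melody.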