Listen to Dance: Music-driven choreography generation using Autoregressive Encoder-Decoder Network
This work addresses the challenge of automatic choreography generation for applications in entertainment and the creative arts, though the contribution is incremental because it builds on existing encoder-decoder methods.
The paper tackles the problem of generating dance choreography from music by proposing an autoregressive encoder-decoder network that uses music-choreography pairs for training. Results from a user study indicate the model generates musically meaningful and natural dance movements for unheard songs.
Automatic choreography generation is a challenging task because it requires an understanding of two abstract concepts, music and dance, which are realized in two different modalities, audio and video respectively. In this paper, we propose a music-driven choreography generation system using an autoregressive encoder-decoder network. To this end, we first collect a set of multimedia clips that include both music and the corresponding dance motion. We then extract the joint coordinates of the dancer from the video and the mel-spectrogram of the music from the audio, and train our network on the resulting music-choreography pairs. At inference time, a novel dance motion is generated from music alone. We performed a user study for a qualitative evaluation of the proposed method, and the results show that the proposed model is able to generate musically meaningful and natural dance movements given an unheard song.
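The inference step described in the abstract, where each generated pose depends on the music and on the previously generated pose, can be illustrated with a minimal sketch. This is not the paper's architecture: the function name `generate_dance`, the single-layer linear encoder and decoder, the random untrained weights, and all dimensions (80 mel bins, 15 two-dimensional joints) are assumptions chosen only to show the autoregressive loop.

```python
import numpy as np

def generate_dance(mel, W_enc, W_dec, seed_pose):
    """Autoregressively generate one pose per music frame (toy sketch).

    mel:       (T, n_mels) mel-spectrogram frames
    W_enc:     (n_mels, hidden) hypothetical encoder weights
    W_dec:     (hidden + pose_dim, pose_dim) hypothetical decoder weights
    seed_pose: (pose_dim,) starting pose as flattened joint coordinates
    """
    # Encode every music frame into a hidden context vector.
    context = np.tanh(mel @ W_enc)
    poses = []
    prev = seed_pose
    for t in range(mel.shape[0]):
        # Decoder input: music context at frame t plus the previous output pose.
        dec_in = np.concatenate([context[t], prev])
        prev = np.tanh(dec_in @ W_dec)
        poses.append(prev)
    return np.stack(poses)

# Assumed sizes for illustration only; weights are random, not trained.
rng = np.random.default_rng(0)
T, n_mels, hidden, n_joints = 50, 80, 32, 15
pose_dim = n_joints * 2  # (x, y) per joint
mel = rng.standard_normal((T, n_mels))
W_enc = rng.standard_normal((n_mels, hidden)) * 0.1
W_dec = rng.standard_normal((hidden + pose_dim, pose_dim)) * 0.1

dance = generate_dance(mel, W_enc, W_dec, np.zeros(pose_dim))
print(dance.shape)  # (50, 30)
```

In a trained system the weights would be learned from music-choreography pairs, and the mel-spectrogram and joint coordinates would come from real audio and video rather than random arrays.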