Music-driven Dance Regeneration with Controllable Key Pose Constraints
This work addresses the problem of customizable dance generation for users in creative applications, though the contribution appears incremental: it builds on existing music-driven synthesis and adds pose control.
The paper tackles music-driven dance motion synthesis with user-specified key pose constraints, enabling the generation of dance sequences that align with both the music and the customized poses. The authors report satisfactory performance in both quantitative and qualitative evaluations.
In this paper, we propose a novel framework for music-driven dance motion synthesis with controllable key pose constraints. In contrast to methods that generate dance motion sequences based only on music, without any other controllable conditions, this work targets the synthesis of high-quality dance motion driven by both music and customized poses performed by users. Our model involves two single-modal transformer encoders for music and motion representations and a cross-modal transformer decoder for dance motion generation. By introducing a local neighbor position embedding, the cross-modal transformer decoder is able to synthesize smooth dance motion sequences that remain consistent with the key poses at their corresponding positions. This mechanism makes the decoder more sensitive to the key poses and their positions. In extensive experiments, our dance synthesis model achieves satisfactory performance in both quantitative and qualitative evaluations, which demonstrates the effectiveness of the proposed method.
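The abstract does not define the local neighbor position embedding, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that each frame is tagged with its signed offset to the nearest key pose when that key pose lies within a fixed window, and that the offset is then mapped to a row of a learned embedding table. The function names, the `window` parameter, and the reserved index 0 are all hypothetical.

```python
from typing import List, Optional


def local_neighbor_offsets(
    seq_len: int, key_positions: List[int], window: int
) -> List[Optional[int]]:
    """Signed offset from each frame to its nearest key pose.

    Assumption (not from the paper): a frame is "local" to a key pose
    when it lies within `window` frames of it; other frames get None.
    """
    offsets: List[Optional[int]] = []
    for t in range(seq_len):
        best: Optional[int] = None
        for k in key_positions:
            d = t - k
            # Keep the closest key pose inside the window (first one wins ties).
            if abs(d) <= window and (best is None or abs(d) < abs(best)):
                best = d
        offsets.append(best)
    return offsets


def offsets_to_embedding_ids(
    offsets: List[Optional[int]], window: int
) -> List[int]:
    """Map offsets to row indices of a hypothetical embedding table.

    The table would have 2 * window + 2 rows: row 0 means "no key pose
    nearby", rows 1 .. 2 * window + 1 cover offsets -window .. +window.
    """
    return [0 if d is None else d + window + 1 for d in offsets]


if __name__ == "__main__":
    # Eight frames, key poses at frames 2 and 6, neighborhood of one frame.
    offsets = local_neighbor_offsets(seq_len=8, key_positions=[2, 6], window=1)
    print(offsets)
    print(offsets_to_embedding_ids(offsets, window=1))
```

In a full model, these indices would select learned vectors that are added to the decoder's input tokens, which is one way a decoder could be made aware of where a key pose sits relative to each frame. Whether the paper's mechanism works this way is not stated in the abstract.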