Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations
This work addresses the lack of exploration in music-conditioned 3D dance generation, which has applications such as animation and virtual reality. It represents an incremental improvement over existing methods.
The paper tackles monotonous and simplistic 3D dance generation from music by proposing the E3D2 framework, which trains a reward model from automatically-ranked demonstrations and uses it to guide reinforcement learning. The authors report improved quality and diversity of dance sequences, validated on the AIST++ dataset.
This paper presents E3D2, an Exploratory 3D Dance generation framework designed to address the lack of exploration capability in existing music-conditioned 3D dance generation models. Because they lack this capability, current models often generate monotonous and simplistic dance sequences that are misaligned with human preferences. E3D2 trains a reward model from automatically-ranked dance demonstrations, and this reward model then guides the reinforcement learning process. The approach encourages the agent to explore and generate high-quality, diverse dance movement sequences. The soundness of the reward model is validated both theoretically and experimentally. Empirical experiments demonstrate the effectiveness of E3D2 on the AIST++ dataset. Project Page: https://sites.google.com/view/e3d2.
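To make the idea of learning a reward model from ranked demonstrations concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the loss, the reward architecture, or how the ranking is produced. This sketch assumes a Bradley-Terry style pairwise ranking loss (a common choice for learning from ranked trajectories), a linear reward over hand-made two-dimensional features, and a toy set of three demonstrations already ordered from worst to best. All function and variable names are hypothetical.

```python
import math

def trajectory_return(weights, trajectory):
    """Sum of linear per-step rewards (w . x) over a trajectory of feature vectors."""
    return sum(sum(w * x for w, x in zip(weights, step)) for step in trajectory)

def pairwise_ranking_loss(weights, worse, better):
    """Bradley-Terry loss -log P(better is preferred over worse).

    Equals softplus(R(worse) - R(better)), computed in a numerically stable way.
    """
    diff = trajectory_return(weights, worse) - trajectory_return(weights, better)
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def train_reward(ranked_demos, dim, lr=0.1, epochs=200):
    """Fit linear reward weights by gradient descent on all ordered pairs.

    ranked_demos: trajectories ordered from worst to best (assumed given).
    """
    weights = [0.0] * dim
    pairs = [(i, j) for i in range(len(ranked_demos))
             for j in range(i + 1, len(ranked_demos))]
    for _ in range(epochs):
        for i, j in pairs:
            worse, better = ranked_demos[i], ranked_demos[j]
            diff = (trajectory_return(weights, worse)
                    - trajectory_return(weights, better))
            # Derivative of softplus(diff) w.r.t. diff is sigmoid(diff).
            p = 0.5 * (1.0 + math.tanh(diff / 2.0))
            for k in range(dim):
                # d(diff)/d(w_k): summed feature k of worse minus that of better.
                g = sum(s[k] for s in worse) - sum(s[k] for s in better)
                weights[k] -= lr * p * g
    return weights

# Toy data (an assumption for illustration): feature 0 grows with rank,
# feature 1 shrinks with rank. Ordered worst to best.
ranked_demos = [
    [[0.0, 1.0], [0.1, 0.9]],
    [[0.5, 0.5], [0.6, 0.4]],
    [[1.0, 0.0], [0.9, 0.1]],
]
weights = train_reward(ranked_demos, dim=2)
```

After training, `trajectory_return(weights, demo)` increases with the demonstration's rank, so the learned function could serve as a reward signal for a reinforcement learning agent. The reinforcement learning step itself is not shown here.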