FlowerDance: MeanFlow for Efficient and Refined 3D Dance Generation
This addresses the need for efficient and high-fidelity 3D dance generation in virtual reality and entertainment, though it appears incremental as it builds on existing methods with novel components.
The paper tackled the problem of inefficient music-to-dance generation for 3D applications by proposing FlowerDance, which achieved state-of-the-art results in motion quality and generation efficiency on benchmarks like AIST++ and FineDance.
Music-to-dance generation aims to translate auditory signals into expressive human motion, with broad applications in virtual reality, choreography, and digital entertainment. Despite promising progress, the limited generation efficiency of existing methods leaves insufficient computational headroom for high-fidelity 3D rendering, thereby constraining the expressiveness of 3D characters during real-world applications. Thus, we propose FlowerDance, which not only generates refined motion with physical plausibility and artistic expressiveness, but also achieves significant generation efficiency on inference speed and memory utilization. Specifically, FlowerDance combines MeanFlow with Physical Consistency Constraints, which enables high-quality motion generation with only a few sampling steps. Moreover, FlowerDance leverages a simple but efficient model architecture with BiMamba-based backbone and Channel-Level Cross-Modal Fusion, which generates dance with efficient non-autoregressive manner. Meanwhile, FlowerDance supports motion editing, enabling users to interactively refine dance sequences. Extensive experiments on AIST++ and FineDance show that FlowerDance achieves state-of-the-art results in both motion quality and generation efficiency. Code will be released upon acceptance. Project page: https://flowerdance25.github.io/ .