VirtualConductor: Music-driven Conducting Video Generation System
The system lets any user become a virtual conductor, but the contribution is incremental: it builds on existing motion generation and pose transfer methods.
The authors tackle the problem of generating a conducting video from a piece of music and a single image of the user. The resulting system produces diverse, plausible, music-synchronized motion with a new network (AMCNet) and renders it through 3D animation and pose transfer.
In this demo, we present VirtualConductor, a system that can generate a conducting video from any given music and a single image of the user. First, we collect and construct a large-scale conductor motion dataset. Then, we propose the Audio Motion Correspondence Network (AMCNet) and adversarial-perceptual learning to learn the cross-modal relationship between music and motion and to generate diverse, plausible, music-synchronized motion. Finally, we combine 3D animation rendering with a pose transfer model to synthesize a conducting video from a single user image. As a result, any user can become a virtual conductor through the system.
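The data flow described above (music to motion, motion to rendered animation, animation plus user image to output video) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class name `ConductingPipeline`, the three stage callables, and the toy stand-ins are hypothetical and are not the authors' code. In particular, `toy_motion_model` is a trivial energy-based placeholder, not AMCNet, and the renderer and pose transfer stubs only pass data along.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Pose = List[float]   # hypothetical: flattened keypoints for one frame
Frame = Dict         # hypothetical: stand-in for one rendered image


@dataclass
class ConductingPipeline:
    """Chains the three stages named in the abstract (illustrative only)."""
    motion_model: Callable[[Sequence[float]], List[Pose]]  # music -> motion
    renderer: Callable[[List[Pose]], List[Frame]]          # motion -> animation frames
    pose_transfer: Callable[[Frame, str], Frame]           # frame + user image -> output frame

    def generate(self, audio: Sequence[float], user_image: str) -> List[Frame]:
        poses = self.motion_model(audio)     # music-synchronized motion
        driving = self.renderer(poses)       # 3D animation rendering
        # Re-target every rendered frame onto the user's appearance.
        return [self.pose_transfer(frame, user_image) for frame in driving]


def toy_motion_model(audio: Sequence[float], hop: int = 4) -> List[Pose]:
    """Placeholder, NOT AMCNet: one pose per window, height = mean |amplitude|."""
    poses = []
    for start in range(0, len(audio), hop):
        window = audio[start:start + hop]
        energy = sum(abs(x) for x in window) / len(window)
        poses.append([0.0, energy])
    return poses


def toy_renderer(poses: List[Pose]) -> List[Frame]:
    """Placeholder renderer: wraps each pose in a frame record."""
    return [{"pose": pose} for pose in poses]


def toy_pose_transfer(frame: Frame, user_image: str) -> Frame:
    """Placeholder pose transfer: tags the frame with the user's identity."""
    return {"pose": frame["pose"], "identity": user_image}


pipeline = ConductingPipeline(toy_motion_model, toy_renderer, toy_pose_transfer)
video = pipeline.generate([0.0, 1.0, -1.0, 0.0, 2.0, 2.0, 2.0, 2.0], "user.png")
print(len(video), video[0])
```

Keeping each stage behind a plain callable means a learned motion generator or a real pose transfer model could be substituted without changing `generate`, which mirrors how the abstract treats the stages as separable components.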