MoManifold: Learning to Measure 3D Human Motion via Decoupled Joint Acceleration Manifolds
This work addresses the need for effective temporal modeling in 3D human motion estimation and generation, which has applications in human-computer interaction and AR/VR, and proposes a new method for this known bottleneck.
The paper tackles the problem of measuring the plausibility of 3D human motion by introducing MoManifold, a novel prior based on decoupled joint acceleration manifolds. The authors report that it outperforms existing state-of-the-art methods in tasks such as denoising, recovery from partial observations, and motion refinement.
Incorporating temporal information effectively is important for accurate 3D human motion estimation and generation, which have wide applications ranging from human-computer interaction to AR/VR. In this paper, we present MoManifold, a novel human motion prior that models plausible human motion in a continuous high-dimensional motion space. Unlike existing mathematical or VAE-based methods, our representation is built on a neural distance field, which explicitly quantifies human dynamics as a score and can thus measure the plausibility of human motion. Specifically, we propose novel decoupled joint acceleration manifolds to model human dynamics from the limited motion data that is available. Moreover, we introduce a novel optimization method that uses the manifold distance as guidance, which facilitates a variety of motion-related tasks. Extensive experiments demonstrate that MoManifold outperforms existing state-of-the-art methods as a prior in several downstream tasks, such as denoising real-world human mocap data, recovering human motion from partial 3D observations, mitigating jitter for SMPL-based pose estimators, and refining the results of motion in-betweening.
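To make the idea of scoring motion through per-joint accelerations concrete, the following is a minimal sketch and not the method described above. It assumes that accelerations are taken as second-order finite differences of 3D joint positions, which the abstract does not specify. The function names (`joint_accelerations`, `decouple_by_joint`, `toy_distance`) and the `max_acc` threshold are hypothetical; in particular, `toy_distance` is a hand-written stand-in for the learned neural distance field, used only to show how a per-joint score could be read off.

```python
import numpy as np


def joint_accelerations(positions, dt):
    """Second-order central differences.

    positions: array of shape (T, J, 3), T frames, J joints.
    Returns accelerations of shape (T - 2, J, 3).
    """
    positions = np.asarray(positions, dtype=float)
    return (positions[2:] - 2.0 * positions[1:-1] + positions[:-2]) / dt**2


def decouple_by_joint(acc):
    """Reorder (T - 2, J, 3) to (J, T - 2, 3) so each joint is handled separately."""
    return np.transpose(acc, (1, 0, 2))


def toy_distance(per_joint_acc, max_acc=50.0):
    """Hypothetical stand-in for a learned distance field (assumption, not the paper's model).

    Returns one score per joint: the mean amount by which the acceleration
    magnitude exceeds max_acc. Zero means "inside" the toy plausible region.
    """
    magnitude = np.linalg.norm(per_joint_acc, axis=-1)          # (J, T - 2)
    return np.maximum(magnitude - max_acc, 0.0).mean(axis=-1)   # (J,)


# Toy motion: joint 0 moves at constant velocity, joint 1 accelerates
# at 100 units/s^2 along x (position = 0.5 * a * t^2).
dt = 0.1
t = np.arange(5) * dt
positions = np.zeros((5, 2, 3))
positions[:, 0, 0] = 2.0 * t
positions[:, 1, 0] = 0.5 * 100.0 * t**2

acc = joint_accelerations(positions, dt)
scores = toy_distance(decouple_by_joint(acc))
print(acc.shape, scores)
```

In this toy setting the constant-velocity joint gets a score of zero, while the strongly accelerating joint gets a positive score. A learned distance field would replace `toy_distance`, and its gradient with respect to the motion could then serve as the guidance signal in an optimization loop.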