CV · LG · Jan 30, 2025

Motion Diffusion Autoencoders: Enabling Attribute Manipulation in Human Motion Demonstrated on Karate Techniques

arXiv:2501.18729v2 · 2 citations · h-index: 24 · ICMI
Originality: Incremental advance
AI Analysis

This work addresses the challenge of precise attribute control in human motion analysis. The advance is incremental: it builds on existing techniques and applies them to a specific domain.

The paper tackles the problem of manipulating individual attributes in human motion data, specifically karate movements. It introduces a novel continuous pose representation and a method that combines a transformer encoder with a diffusion model, and it reports the first successful attribute manipulation in this domain together with accurate reconstruction.

Attribute manipulation is the problem of changing individual attributes of a data point or a time series while leaving all other aspects unaffected. This work focuses on the domain of human motion, more precisely karate movement patterns. To the best of our knowledge, it presents the first success at manipulating attributes of human motion data. One of the key requirements for achieving attribute manipulation on human motion is a suitable pose representation. Therefore, we design a novel continuous, rotation-based pose representation that enables the disentanglement of the human skeleton and the motion trajectory, while still allowing an accurate reconstruction of the original anatomy. The core idea of the manipulation approach is to use a transformer encoder for discovering high-level semantics, and a diffusion probabilistic model for modeling the remaining stochastic variations. We show that the embedding space obtained from the transformer encoder is semantically meaningful and linear. This enables the manipulation of high-level attributes by discovering their linear direction of change in the semantic embedding space and moving the embedding along that direction. All code and data are made publicly available.
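
The last step of the abstract, moving an embedding along a linear attribute direction, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `attribute_direction` and `manipulate`, the toy 4-dimensional embeddings, and the choice to estimate the direction as the difference of class means. The abstract does not say how the direction is discovered, so this should not be read as the paper's procedure. The transformer encoder and the diffusion decoder are not modeled; only the edit in embedding space is shown.

```python
import numpy as np


def attribute_direction(embeddings, labels):
    """Estimate a unit direction of change for a binary attribute.

    Assumption: the direction is the difference between the mean embedding
    of label-1 samples and the mean embedding of label-0 samples.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    diff = embeddings[labels == 1].mean(axis=0) - embeddings[labels == 0].mean(axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("class means coincide; no direction can be estimated")
    return diff / norm


def manipulate(z, direction, strength):
    """Shift a semantic embedding z along the attribute direction."""
    return np.asarray(z, dtype=float) + strength * direction


# Toy stand-in for encoder outputs: the attribute shifts the first coordinate.
rng = np.random.default_rng(0)
true_dir = np.array([1.0, 0.0, 0.0, 0.0])
labels = np.array([0] * 50 + [1] * 50)
embeddings = rng.normal(scale=0.1, size=(100, 4)) + np.outer(labels, 2.0 * true_dir)

direction = attribute_direction(embeddings, labels)
z = embeddings[0]                       # a sample with label 0
z_edit = manipulate(z, direction, 2.0)  # pushed toward label 1
```

In a full pipeline of the kind the abstract describes, the edited embedding would then condition the diffusion model to generate the manipulated motion. Components of the embedding orthogonal to `direction` are left unchanged by the edit.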

Code Implementations: 1 repo
