CV AI

Oct 14, 2025

Unconditional Human Motion and Shape Generation via Balanced Score-Based Diffusion

David Björkstrand, Tiesheng Wang, Lars Bretzner, Josephine Sullivan

arXiv:2510.12537v18.42 citations

h-index: 1

Originality: Incremental advance

AI Analysis

The paper addresses efficient, high-quality human motion and shape generation for applications such as animation and virtual reality. The advance is incremental: it builds on existing diffusion models and adds specific optimizations.

The paper tackles unconditional human motion and shape generation by showing that a score-based diffusion model with careful feature normalization and analytically derived loss weightings achieves state-of-the-art results, generating both motion and shape directly without slow post-processing.

Recent work has explored a range of model families for human motion generation, including Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion-based models. Despite their differences, many methods rely on over-parameterized input features and auxiliary losses to improve empirical results. These strategies should not be strictly necessary for diffusion models to match the human motion distribution. We show that results on par with the state of the art in unconditional human motion generation are achievable with a score-based diffusion model using only careful feature-space normalization and analytically derived weightings for the standard L2 score-matching loss. The model generates both motion and shape directly, thereby avoiding slow post hoc shape recovery from joints. We build the method step by step, with a clear theoretical motivation for each component, and provide targeted ablations demonstrating the effectiveness of each proposed addition in isolation.
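To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch of per-feature normalization and a weighted L2 denoising score-matching loss. It is an illustration under stated assumptions, not the authors' implementation: the function names `normalize_features` and `weighted_dsm_loss` are hypothetical, the normalization is plain per-dimension standardization, and the weighting `sigma**2` is the common textbook choice for score matching, which may differ from the analytically derived weightings in the paper.

```python
import numpy as np


def normalize_features(x, eps=1e-8):
    """Standardize each feature dimension to zero mean and unit variance.

    Assumption: simple per-dimension standardization; the paper's exact
    feature-space normalization may differ.
    """
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps), mean, std


def weighted_dsm_loss(score_fn, x, sigma, rng):
    """Weighted L2 denoising score-matching loss at a single noise level.

    Assumption: the weighting sigma**2 is the standard generic choice,
    used here only as a stand-in for the paper's derived weightings.
    """
    noise = rng.standard_normal(x.shape)
    x_noisy = x + sigma * noise
    # Score of the Gaussian perturbation kernel N(x_noisy; x, sigma^2 I)
    # with respect to x_noisy: -(x_noisy - x) / sigma^2 = -noise / sigma.
    target = -noise / sigma
    pred = score_fn(x_noisy, sigma)
    per_sample = np.sum((pred - target) ** 2, axis=-1)
    return float(np.mean(sigma ** 2 * per_sample))
```

In this sketch, `score_fn` stands for any callable that maps noisy features and a noise level to a score estimate, such as a neural network. Normalizing first puts every feature dimension on a comparable scale, so a single noise level `sigma` perturbs all of them to a similar degree.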
