CV · May 12

Enhancing Domain Generalization in 3D Human Pose Estimation through Controllable Generative Augmentation

arXiv:2605.12198
AI Analysis

For researchers in 3D human pose estimation, this work addresses domain generalization by providing a data augmentation method that improves model robustness to domain shifts.

This work presents a controllable human pose generation framework that synthesizes diverse video data by varying poses, backgrounds, and camera viewpoints to augment training datasets for 3D human pose estimation, achieving significant performance improvements on unseen scenarios and datasets.

Pedestrian motion, due to its causal nature, is strongly influenced by domain gaps arising from discrepancies between training and testing data distributions. Focusing on 3D human pose estimation, this work presents a controllable human pose generation framework that synthesizes diverse video data by systematically varying poses, backgrounds, and camera viewpoints. This generative augmentation enriches training datasets, enhances model generalization, and alleviates the limitations of existing methods in handling domain discrepancies. By leveraging both indoor/real-world and outdoor/virtual datasets, we perform cross-domain data fusion and controllable video generation to construct enriched training data, tailored to realistic deployment settings. Extensive experiments show that the augmented datasets significantly improve model performance on unseen scenarios and datasets, validating the effectiveness of the proposed approach.
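As a rough illustration of what "varying camera viewpoints" means for pose data, the sketch below rotates a 3D skeleton about the vertical axis and re-projects it to 2D with a pinhole camera. This is a purely geometric stand-in and not the paper's generative video framework; the function names, the focal length, and the camera distance are all hypothetical choices made for the example.

```python
import math


def rotate_about_vertical(joints, yaw_deg):
    """Rotate (x, y, z) joints about the y (vertical) axis by yaw_deg degrees."""
    yaw = math.radians(yaw_deg)
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in joints]


def project_pinhole(joints, focal=1000.0, camera_distance=5.0):
    """Project 3D joints to 2D with a pinhole camera placed camera_distance
    in front of the origin. Both defaults are assumed values, not from the paper."""
    projected = []
    for x, y, z in joints:
        depth = z + camera_distance
        if depth <= 0:
            raise ValueError("joint lies behind the camera")
        projected.append((focal * x / depth, focal * y / depth))
    return projected


def viewpoint_variants(joints, yaw_angles_deg):
    """Return one 2D projection of the same 3D pose per requested yaw angle."""
    return {
        yaw: project_pinhole(rotate_about_vertical(joints, yaw))
        for yaw in yaw_angles_deg
    }


if __name__ == "__main__":
    # Two-joint toy "skeleton": a root at the origin and one limb endpoint.
    pose = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
    for yaw, points in viewpoint_variants(pose, [0, 45, 90]).items():
        print(yaw, points)
```

Each variant shares the same 3D ground truth up to a known rotation, so 2D/3D training pairs for new viewpoints come for free. A generative approach such as the one described above would additionally change appearance and background, which this sketch does not attempt.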
