CV · Dec 22, 2023

PoseGen: Learning to Generate 3D Human Pose Dataset with NeRF

arXiv:2312.14915v1 · 4 citations · h-index: 11 · AAAI
Originality: Incremental advance
AI Analysis

This work addresses the poor generalization of 3D human pose estimators to out-of-distribution inputs by generating out-of-distribution training data that makes a pre-trained model more robust. It is an incremental advance whose novelty lies in applying NeRF to data generation.

The paper tackles the limited diversity of 3D human pose datasets with PoseGen, an end-to-end framework that uses Neural Radiance Fields (NeRF) to generate data tailored to a given pre-trained pose estimator. Fine-tuning on this data yields an average 6% relative improvement for two baseline models (SPIN and HybrIK) across four datasets.

This paper proposes an end-to-end framework for generating 3D human pose datasets using Neural Radiance Fields (NeRF). Public datasets generally have limited diversity in terms of human poses and camera viewpoints, largely due to the resource-intensive nature of collecting 3D human pose data. As a result, pose estimators trained on public datasets significantly underperform when applied to unseen out-of-distribution samples. Previous works proposed augmenting public datasets by generating 2D-3D pose pairs or rendering a large amount of random data. Such approaches either overlook image rendering or result in suboptimal datasets for pre-trained models. Here we propose PoseGen, which learns to generate a dataset (human 3D poses and images) with a feedback loss from a given pre-trained pose estimator. In contrast to prior art, our generated data is optimized to improve the robustness of the pre-trained model. The objective of PoseGen is to learn a distribution of data that maximizes the prediction error of a given pre-trained model. As the learned data distribution contains OOD samples of the pre-trained model, sampling data from such a distribution for further fine-tuning a pre-trained model improves the generalizability of the model. This is the first work that proposes NeRFs for 3D human data generation. NeRFs are data-driven and do not require 3D scans of humans. Therefore, using NeRF for data generation is a new direction for convenient user-specific data generation. Our extensive experiments show that the proposed PoseGen improves two baseline models (SPIN and HybrIK) on four datasets with an average 6% relative improvement.
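
The objective described in the abstract can be read as a two-step optimization. The block below is a sketch of that reading, not the paper's own formulation: the symbols G_θ (the NeRF-based generator), f_φ (the pose estimator, with pre-trained weights φ_0), I (a rendered image), p (its 3D pose), and the loss are all assumed notation. The abstract does not specify the loss or say whether the two steps alternate.

```latex
% Notation is illustrative and not taken from the paper.
% G_\theta    : NeRF-based data generator with learnable parameters \theta
% f_\phi      : pose estimator; \phi_0 denotes its pre-trained weights
% \mathcal{L} : pose prediction error (the specific loss is not given in the abstract)
\begin{align}
% Step 1: learn a data distribution on which the pre-trained model performs worst.
\theta^{*} &= \arg\max_{\theta}\;
  \mathbb{E}_{(I,\,p)\sim G_{\theta}}
  \big[\mathcal{L}\big(f_{\phi_0}(I),\, p\big)\big] \\
% Step 2: fine-tune the model on samples drawn from that distribution.
\phi^{*} &= \arg\min_{\phi}\;
  \mathbb{E}_{(I,\,p)\sim G_{\theta^{*}}}
  \big[\mathcal{L}\big(f_{\phi}(I),\, p\big)\big],
  \quad \phi \text{ initialized at } \phi_0
\end{align}
```

Under this reading, the first step searches for poses and viewpoints where the pre-trained estimator fails, and the second step uses those samples as fine-tuning data, which matches the abstract's claim that sampling from the learned distribution improves generalizability.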

Code Implementations: 1 repo
