CV
Aug 15, 2021

EventHPE: Event-based 3D Human Pose and Shape Estimation

arXiv:2108.06819v1
77 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of 3D human motion capture for robotics or AR/VR using event cameras, but it is incremental as it builds on existing methods with a new loss and dataset.

The paper tackles 3D human pose and shape estimation from event camera signals by proposing EventHPE, a two-stage deep learning approach with a novel flow coherence loss, and evaluates it on DHP19 and a new in-house dataset.

The event camera is an emerging imaging sensor that captures the dynamics of moving objects as events, which motivates our work on estimating 3D human pose and shape from event signals. Events, however, pose unique challenges: rather than capturing static body postures, event signals are best at capturing local motions. This leads us to propose a two-stage deep learning approach called EventHPE. The first stage, FlowNet, is trained by unsupervised learning to infer optical flow from events. Because both events and optical flow are closely related to human body dynamics, both are fed as input to ShapeNet in the second stage to estimate 3D human shapes. To mitigate the discrepancy between image-based flow (optical flow) and shape-based flow (vertex movement of the human body shape), a novel flow coherence loss is introduced that exploits the fact that both flows originate from the same human motion. We curate an in-house event-based 3D human dataset with 3D pose and shape annotations, which is, to our knowledge, by far the largest of its kind. Empirical evaluations on the DHP19 dataset and our in-house dataset demonstrate the effectiveness of our approach.
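The abstract does not give the formula for the flow coherence loss, so the following is only a minimal NumPy sketch of the idea it describes: compare the image-based flow with the shape-based flow obtained from projected vertex movement. The pinhole projection, the nearest-neighbour sampling, the cosine-based penalty, and every function and parameter name (`project`, `sample_flow`, `flow_coherence_loss`, `focal`, `center`) are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def project(vertices, focal, center):
    """Pinhole projection (assumed camera model) of (N, 3) camera-space
    vertices to (N, 2) pixel coordinates in (x, y) order."""
    xy = vertices[:, :2] / vertices[:, 2:3]
    return focal * xy + np.asarray(center, dtype=float)


def sample_flow(flow, pixels):
    """Nearest-neighbour lookup of an (H, W, 2) flow field at (N, 2) pixel
    coordinates; out-of-bounds coordinates are clamped to the image border."""
    h, w, _ = flow.shape
    cols = np.clip(np.rint(pixels[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(pixels[:, 1]).astype(int), 0, h - 1)
    return flow[rows, cols]


def flow_coherence_loss(verts_t0, verts_t1, optical_flow, focal, center, eps=1e-8):
    """Hypothetical coherence penalty: mean (1 - cosine similarity) between
    the shape-based flow (projected vertex displacement) and the optical
    flow sampled at the projected vertex positions."""
    p0 = project(verts_t0, focal, center)
    p1 = project(verts_t1, focal, center)
    shape_flow = p1 - p0                       # shape-based flow per vertex
    image_flow = sample_flow(optical_flow, p0)  # image-based flow per vertex
    dot = np.sum(shape_flow * image_flow, axis=1)
    norms = np.linalg.norm(shape_flow, axis=1) * np.linalg.norm(image_flow, axis=1)
    cos = dot / (norms + eps)
    return float(np.mean(1.0 - cos))


# Toy usage: two vertices translating to the right, against a flow field
# that points to the right everywhere.
verts_t0 = np.array([[0.0, 0.0, 2.0], [0.5, 0.5, 2.0]])
verts_t1 = verts_t0 + np.array([0.1, 0.0, 0.0])
flow_right = np.zeros((64, 64, 2))
flow_right[..., 0] = 1.0
loss = flow_coherence_loss(verts_t0, verts_t1, flow_right, focal=100.0, center=(32.0, 32.0))
```

In this sketch the penalty depends only on flow direction, so it approaches 0 when the two flows are aligned and 2 when they point in opposite directions. A vertex with zero displacement contributes a penalty of 1, which a real implementation would probably mask out.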

Code Implementations: 1 repo
