CV, LG, RO | Sep 10, 2019

FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images

arXiv:1909.04349v3 | 500 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for unbiased training data in hand pose and shape estimation, which benefits computer vision and human-computer interaction. It is rated incremental because it builds on existing datasets and methods.

The authors tackle poor cross-dataset generalization in 3D hand pose estimation from single RGB images by introducing FreiHAND, a large-scale, multi-view dataset with 3D pose and shape annotations. Methods trained on it perform consistently well on other datasets, and the dataset makes it possible to train a network that predicts the full articulated hand shape.

Estimating 3D hand pose from single RGB images is a highly ambiguous problem that relies on an unbiased training dataset. In this paper, we analyze cross-dataset generalization when training on existing datasets. We find that approaches perform well on the datasets they are trained on, but do not generalize to other datasets or in-the-wild scenarios. As a consequence, we introduce the first large-scale, multi-view hand dataset that is accompanied by both 3D hand pose and shape annotations. For annotating this real-world dataset, we propose an iterative, semi-automated 'human-in-the-loop' approach, which includes hand fitting optimization to infer both the 3D pose and shape for each sample. We show that methods trained on our dataset consistently perform well when tested on other datasets. Moreover, the dataset allows us to train a network that predicts the full articulated hand shape from a single RGB image. The evaluation set can serve as a benchmark for articulated hand shape estimation.
