CV · AI · Sep 11, 2019

Adaptive Wasserstein Hourglass for Weakly Supervised Hand Pose Estimation from Monocular RGB

arXiv:1909.05666v1 · 8 citations
Originality: Incremental advance
AI Analysis

This work addresses data scarcity, a bottleneck in 3D hand pose estimation for computer vision applications, with a domain adaptation approach that is an incremental advance.

The paper tackles the shortage of labeled training data for 3D hand pose estimation from monocular RGB images. It proposes a domain adaptation method called Adaptive Wasserstein Hourglass, which bridges the gap between synthetic and real-world datasets to enable weakly-supervised estimation. Competitive results on benchmarks indicate improved generalization.

Insufficient labeled training data is one of the bottlenecks of 3D hand pose estimation from monocular RGB images. Synthetic datasets offer a large number of images with precise annotations, but their obvious difference from real-world datasets hurts generalization. Little work has been done to bridge the wide gap between the two domains. In this paper, we propose a domain adaptation method called Adaptive Wasserstein Hourglass (AW Hourglass) for weakly-supervised 3D hand pose estimation, which aims to distinguish the differences between synthetic and real-world datasets and to explore their common characteristics (e.g. hand structure). Learning the common characteristics helps the network focus on pose-related information, and their similarity makes it easier to enforce domain-invariant constraints. During training, the relation between these common characteristics and 3D pose, learned from fully annotated synthetic datasets, helps the network recover the 3D pose of weakly labeled real-world datasets with the aid of 2D annotations and depth images. At test time, the network predicts the 3D pose from RGB input.
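The training scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the linear critic, the weight `lam`, and the use of the x, y coordinates as a stand-in for 2D projection are all assumptions made for illustration, and the depth-image term mentioned in the abstract is omitted. The sketch combines full 3D supervision on synthetic samples, 2D-only supervision on real samples, and a critic-based lower bound on the Wasserstein-1 distance between the feature distributions of the two domains.

```python
import numpy as np

def pose_loss(pred, target):
    """Mean squared error over all joints and coordinates."""
    return float(np.mean((pred - target) ** 2))

def critic_gap(critic_w, feats_syn, feats_real):
    """Difference of mean critic scores between the two domains.

    A linear critic with unit-norm weights is 1-Lipschitz, so this value is a
    lower bound on the Wasserstein-1 distance (Kantorovich-Rubinstein duality).
    The linear critic is an illustrative assumption, not the paper's design.
    """
    w = critic_w / max(np.linalg.norm(critic_w), 1e-12)
    return float(np.mean(feats_syn @ w) - np.mean(feats_real @ w))

def training_loss(pred3d_syn, gt3d_syn, pred3d_real, gt2d_real,
                  feats_syn, feats_real, critic_w, lam=0.1):
    """Hypothetical combined objective for one batch."""
    # Synthetic data: full 3D annotations are available.
    loss_syn = pose_loss(pred3d_syn, gt3d_syn)
    # Real data: only 2D annotations, so supervise x, y and leave z free
    # (assumes an orthographic projection for simplicity).
    loss_real = pose_loss(pred3d_real[..., :2], gt2d_real)
    # Domain term: penalize the feature gap between the two domains.
    gap = abs(critic_gap(critic_w, feats_syn, feats_real))
    return loss_syn + loss_real + lam * gap
```

In an adversarial setup the critic would be trained to maximize the gap while the feature extractor is trained to minimize it; the sketch only evaluates the objective for fixed inputs.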

