CV · AI · Mar 27, 2023

Pushing the Envelope for Depth-Based Semi-Supervised 3D Hand Pose Estimation with Consistency Training

arXiv:2303.15147v1 · 4 citations · h-index: 44
Originality: Incremental advance
AI Analysis

This work addresses the cost and time of collecting labeled training data, a practical concern for computer vision researchers and practitioners. It appears incremental, since it builds on existing semi-supervised and consistency training paradigms.

The paper aims to reduce the need for labeled data in depth-based 3D hand pose estimation. It proposes a semi-supervised method in which a teacher network and a student network are trained jointly with consistency training, and it reports outperforming state-of-the-art semi-supervised methods by large margins.

Despite the significant progress that depth-based 3D hand pose estimation methods have made in recent years, they still require a large amount of labeled training data to achieve high accuracy. However, collecting such data is both costly and time-consuming. To tackle this issue, we propose a semi-supervised method to significantly reduce the dependence on labeled training data. The proposed method consists of two identical networks trained jointly: a teacher network and a student network. The teacher network is trained using both the available labeled and unlabeled samples. It leverages the unlabeled samples via a loss formulation that encourages estimation equivariance under a set of affine transformations. The student network is trained using the unlabeled samples with their pseudo-labels provided by the teacher network. For inference at test time, only the student network is used. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art semi-supervised methods by large margins.
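The equivariance idea in the abstract can be illustrated with a small sketch: if T is an affine transformation and f is the estimator, a consistency loss penalizes the gap between f(T(x)) and T(f(x)) on unlabeled samples. The code below is a toy illustration only, not the paper's loss. It assumes 2D point sets instead of depth images, restricts T to rotation, scaling, and translation, and uses hypothetical names (`make_affine`, `equivariance_loss`, `centroid_estimator`, `biased_estimator`) that do not come from the paper.

```python
import math

def make_affine(angle_deg, scale, tx, ty):
    """Build a 2D similarity transform (a subset of affine maps) as (A, t)."""
    c = math.cos(math.radians(angle_deg)) * scale
    s = math.sin(math.radians(angle_deg)) * scale
    return ((c, -s), (s, c)), (tx, ty)

def apply_affine(transform, points):
    """Apply p -> A p + t to every (x, y) point."""
    (a, b), (c, d) = transform[0]
    tx, ty = transform[1]
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]

def equivariance_loss(estimator, sample, transform):
    """Mean squared distance between f(T(x)) and T(f(x)); needs no labels."""
    pred_of_transformed = estimator(apply_affine(transform, sample))
    transformed_pred = apply_affine(transform, estimator(sample))
    sq = [(px - qx) ** 2 + (py - qy) ** 2
          for (px, py), (qx, qy) in zip(pred_of_transformed, transformed_pred)]
    return sum(sq) / len(sq)

def centroid_estimator(points):
    """Toy estimator predicting one keypoint: the centroid (affine-equivariant)."""
    n = len(points)
    return [(sum(x for x, _ in points) / n, sum(y for _, y in points) / n)]

def biased_estimator(points):
    """Toy estimator that is NOT equivariant: it adds a fixed offset."""
    (cx, cy), = centroid_estimator(points)
    return [(cx + 1.0, cy)]

# Hypothetical unlabeled sample and transformation (values chosen arbitrarily).
sample = [(0.0, 0.0), (2.0, 0.0), (2.0, 4.0), (0.0, 4.0)]
T = make_affine(angle_deg=90.0, scale=1.0, tx=3.0, ty=-1.0)

loss_equivariant = equivariance_loss(centroid_estimator, sample, T)
loss_biased = equivariance_loss(biased_estimator, sample, T)
```

Here `loss_equivariant` is zero up to floating-point error, while `loss_biased` is strictly positive, so minimizing such a term would push an estimator toward equivariant predictions. In the setup the abstract describes, a term of this kind would apply to the teacher on unlabeled samples, and the teacher's predictions would then serve as pseudo-labels for the student.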
