CV · Nov 17, 2022

Detecting Arbitrary Keypoints on Limbs and Skis with Sparse Partly Correct Segmentation Masks

arXiv:2211.09446v1 · 10 citations · h-index: 42
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient posture analysis in sports such as ski jumping, where manual labeling is expensive. The advance is incremental because it builds on existing Vision Transformer methods.

The paper tackles the detection of arbitrary keypoints on the limbs and skis of professional ski jumpers, using only a few partly correct segmentation masks during training. It shows that such masks are sufficient for learning keypoint detection, so no costly hand-annotated masks are needed.

Analyses based on the body posture are crucial for top-class athletes in many sports disciplines. If at all, coaches label only the most important keypoints, since manual annotations are very costly. This paper proposes a method to detect arbitrary keypoints on the limbs and skis of professional ski jumpers that requires a few, only partly correct segmentation masks during training. Our model is based on the Vision Transformer architecture with a special design for the input tokens to query for the desired keypoints. Since we use segmentation masks only to generate ground truth labels for the freely selectable keypoints, partly correct segmentation masks are sufficient for our training procedure. Hence, there is no need for costly hand-annotated segmentation masks. We analyze different training techniques for freely selected and standard keypoints, including pseudo labels, and show in our experiments that only a few partly correct segmentation masks are sufficient for learning to detect arbitrary keypoints on limbs and skis.
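The abstract says that segmentation masks are used only to generate ground truth labels for the freely selectable keypoints, but it does not spell out the procedure. The following is a minimal sketch of one plausible way to do this, not the authors' implementation. It assumes a limb is enclosed by two standard keypoints, and that an arbitrary keypoint is described by a position `t` along the line between them and a lateral position `s` between the two mask edges. The function name `sample_limb_keypoint` and all of its parameters are hypothetical.

```python
import numpy as np


def sample_limb_keypoint(mask, kp_a, kp_b, t, s, max_steps=200):
    """Hypothetical helper: place a keypoint on a limb using its mask.

    mask       : 2D boolean array, True on pixels of the limb (or ski).
    kp_a, kp_b : (x, y) standard keypoints enclosing the limb.
    t          : position along the segment kp_a -> kp_b, in [0, 1].
    s          : lateral position in [-1, 1]; 0 is the centre line,
                 -1 and +1 are the mask edges on either side.
    Returns the (x, y) coordinates as a float array.
    """
    a = np.asarray(kp_a, dtype=float)
    b = np.asarray(kp_b, dtype=float)
    axis = b - a
    length = np.linalg.norm(axis)
    if length == 0:
        raise ValueError("enclosing keypoints must be distinct")

    # Point on the centre line between the two standard keypoints.
    base = a + t * axis

    # Unit vector orthogonal to the limb axis; the sign of s picks the side.
    normal = np.array([-axis[1], axis[0]]) / length
    direction = normal if s >= 0 else -normal

    # Walk outward pixel by pixel until the mask (or the image) ends.
    height, width = mask.shape
    reach = 0
    for step in range(1, max_steps + 1):
        x, y = np.rint(base + step * direction).astype(int)
        if not (0 <= x < width and 0 <= y < height) or not mask[y, x]:
            break
        reach = step

    # Scale the distance to the edge by the requested lateral fraction.
    return base + abs(s) * reach * direction


# Toy example: a horizontal "limb" five pixels thick.
mask = np.zeros((20, 20), dtype=bool)
mask[8:13, 2:19] = True
edge_point = sample_limb_keypoint(mask, (2, 10), (18, 10), t=0.5, s=1.0)
inner_point = sample_limb_keypoint(mask, (2, 10), (18, 10), t=0.5, s=-0.5)
```

Under these assumptions only the mask of the queried limb has to be correct near the sampled point, which is one way to read the claim that partly correct masks suffice: labels can be drawn from the correct parts of a mask while faulty parts are skipped.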

Code Implementations: 1 repo
