CV · Jul 21, 2022

Pose for Everything: Towards Category-Agnostic Pose Estimation

Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

arXiv:2207.10387v120.659 citations · h-index: 98 · Has Code

Originality: Highly original

AI Analysis

This work addresses the need for pose estimation in diverse applications beyond specific categories such as humans or animals, though it is incremental in that it extends existing methods to a broader scope.

The paper tackles the problem of pose estimation for unseen object categories by introducing Category-Agnostic Pose Estimation (CAPE) and a novel framework called POMNet, which outperforms baseline approaches by a large margin.

Existing works on 2D pose estimation mainly focus on a certain category, e.g. human, animal, and vehicle. However, many application scenarios require detecting the poses/keypoints of unseen classes of objects. In this paper, we introduce the task of Category-Agnostic Pose Estimation (CAPE), which aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definitions. To achieve this goal, we formulate the pose estimation problem as a keypoint matching problem and design a novel CAPE framework, termed POse Matching Network (POMNet). A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images. We also introduce the Multi-category Pose (MP-100) dataset, a 2D pose dataset of 100 object categories containing over 20K instances, which is well-designed for developing CAPE algorithms. Experiments show that our method outperforms other baseline approaches by a large margin. Code and data are available at https://github.com/luminxu/Pose-for-Everything.
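
The keypoint-matching formulation in the abstract can be illustrated with a toy sketch. This is not the POMNet implementation: it assumes feature maps are already available as NumPy arrays, and it substitutes plain cosine similarity for the learned Keypoint Interaction Module and matching head. The function names `sample_keypoint_features` and `match_keypoints` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def sample_keypoint_features(support_feat, keypoints):
    """Read one feature vector per annotated support keypoint.

    support_feat: (C, H, W) feature map of the support image.
    keypoints:    (K, 2) integer (x, y) positions in feature-map coordinates.
    Returns an array of shape (K, C).
    """
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    return support_feat[:, ys, xs].T

def match_keypoints(query_feat, kp_feats):
    """Match each support keypoint feature against every query location.

    Assumption: cosine similarity stands in for a learned matching module.
    Returns similarity heatmaps (K, H, W) and predicted (x, y) positions (K, 2).
    """
    C, H, W = query_feat.shape
    q = query_feat.reshape(C, H * W)
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    k = kp_feats / (np.linalg.norm(kp_feats, axis=1, keepdims=True) + 1e-8)
    sim = k @ q                                   # (K, H*W) similarity scores
    heatmaps = sim.reshape(-1, H, W)
    idx = sim.argmax(axis=1)                      # best-matching location per keypoint
    preds = np.stack([idx % W, idx // W], axis=1)  # flat index -> (x, y)
    return heatmaps, preds

# Toy data: the "query" is the support feature map shifted by 1 row and 2 columns.
rng = np.random.default_rng(0)
support_feat = rng.normal(size=(8, 6, 6))
query_feat = np.roll(support_feat, shift=(1, 2), axis=(1, 2))
support_keypoints = np.array([[1, 1], [3, 2]])    # (x, y) pairs

kp_feats = sample_keypoint_features(support_feat, support_keypoints)
heatmaps, preds = match_keypoints(query_feat, kp_feats)
print(preds)  # each keypoint should reappear at (x + 2, y + 1)
```

Because the keypoints are defined only by the support annotations, nothing in this sketch depends on the object category. A real system would replace the raw feature lookup and cosine similarity with learned components.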
