CV · Nov 25, 2024

Edge Weight Prediction For Category-Agnostic Pose Estimation

arXiv:2411.16665v12.01 citations · h-index: 49 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key limitation in pose estimation for diverse object categories, offering an incremental improvement over existing methods by optimizing graph-based localization.

The paper targets a weakness of existing Category-Agnostic Pose Estimation (CAPE) methods: they rely on static pose graphs with equal-weight edges, which limits localization accuracy. It introduces EdgeCape, a framework that predicts the graph's edge weights and integrates Markovian Structural Bias into self-attention. On the MP-100 benchmark (100 categories, over 20K images), EdgeCape reports state-of-the-art results in the 1-shot setting and the best results among similar-sized methods in the 5-shot setting.

Category-Agnostic Pose Estimation (CAPE) localizes keypoints across diverse object categories with a single model, using one or a few annotated support images. Recent works have shown that using a pose graph (i.e., treating keypoints as nodes in a graph rather than isolated points) helps handle occlusions and break symmetry. However, these methods assume a static pose graph with equal-weight edges, leading to suboptimal results. We introduce EdgeCape, a novel framework that overcomes these limitations by predicting the graph's edge weights, which optimizes localization. To further leverage structural priors, we propose integrating Markovian Structural Bias, which modulates the self-attention interaction between nodes based on the number of hops between them. We show that this improves the model's ability to capture global spatial dependencies. Evaluated on the MP-100 benchmark, which includes 100 categories and over 20K images, EdgeCape achieves state-of-the-art results in the 1-shot setting and leads among similar-sized methods in the 5-shot setting, significantly improving keypoint localization accuracy. Our code is publicly available.
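To make the hop-based idea in the abstract concrete, here is a minimal NumPy sketch of self-attention whose logits are penalized by the number of hops between keypoints in the pose graph. It is an illustration under stated assumptions, not EdgeCape's implementation: the abstract does not give the exact form of the bias, so the linear penalty, the `decay` parameter, and the function names `hop_distances` and `biased_attention` are all hypothetical. The sketch also uses an unweighted skeleton and does not model the predicted edge weights.

```python
import numpy as np


def hop_distances(adj):
    """All-pairs hop counts via BFS on an unweighted adjacency matrix.

    Unreachable pairs keep a distance of infinity.
    """
    n = adj.shape[0]
    dist = np.full((n, n), np.inf)
    for src in range(n):
        dist[src, src] = 0
        frontier, d = [src], 0
        while frontier:
            d += 1
            nxt = []
            for u in frontier:
                for v in np.nonzero(adj[u])[0]:
                    if dist[src, v] == np.inf:  # first visit = shortest hop count
                        dist[src, v] = d
                        nxt.append(int(v))
            frontier = nxt
    return dist


def biased_attention(q, k, v, hops, decay=0.5):
    """Scaled dot-product attention with an additive hop-count penalty.

    ASSUMPTION: a linear penalty `decay * hops` (decay > 0) stands in for the
    paper's structural bias, whose exact form is not given in the abstract.
    """
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = logits - decay * hops  # infinite hops -> -inf logit -> zero weight
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy pose graph: a chain of four keypoints 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
hops = hop_distances(adj)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # one 8-dim feature per keypoint (made-up data)
out, weights = biased_attention(x, x, x, hops)
```

With identical queries and keys, the only thing separating the attention weights is the hop penalty, so a keypoint attends more strongly to graph neighbors than to distant nodes. A learned, per-edge weighting as described in the abstract would replace the fixed 0/1 adjacency used here.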

