CV · Nov 25, 2024

Edge Weight Prediction For Category-Agnostic Pose Estimation

arXiv:2411.16665v1 · 1 citation · h-index: 49 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the reliance on static, equal-weight pose graphs in pose estimation for diverse object categories, offering an incremental improvement over existing methods by predicting edge weights for graph-based localization.

The paper tackles suboptimal performance in Category-Agnostic Pose Estimation (CAPE) caused by static pose graphs with equal-weight edges. It introduces EdgeCape, a framework that predicts edge weights and integrates Markovian Structural Bias. On the MP-100 benchmark (100 categories, over 20K images), EdgeCape achieves state-of-the-art results in the 1-shot setting and leads among similar-sized methods in the 5-shot setting.

Category-Agnostic Pose Estimation (CAPE) localizes keypoints across diverse object categories with a single model, using one or a few annotated support images. Recent works have shown that using a pose graph (i.e., treating keypoints as nodes in a graph rather than isolated points) helps handle occlusions and break symmetry. However, these methods assume a static pose graph with equal-weight edges, leading to suboptimal results. We introduce EdgeCape, a novel framework that overcomes these limitations by predicting the graph's edge weights, which optimizes localization. To further leverage structural priors, we propose integrating Markovian Structural Bias, which modulates the self-attention interaction between nodes based on the number of hops between them. We show that this improves the model's ability to capture global spatial dependencies. Evaluated on the MP-100 benchmark, which includes 100 categories and over 20K images, EdgeCape achieves state-of-the-art results in the 1-shot setting and leads among similar-sized methods in the 5-shot setting, significantly improving keypoint localization accuracy. Our code is publicly available.
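The hop-based modulation of self-attention described in the abstract can be illustrated with a small sketch. This is not the EdgeCape implementation. It assumes, purely for illustration, that the bias is subtracted from the attention logits in proportion to hop count, that any positive edge weight counts as a connection, and that `decay` is a positive constant. The names `hop_distances`, `biased_attention`, and `decay` are hypothetical and do not come from the paper.

```python
from collections import deque
import math


def hop_distances(adj):
    """All-pairs hop counts via BFS; unreachable pairs stay at math.inf.

    `adj` is a square matrix of edge weights. Assumption for this sketch:
    any weight > 0 is treated as an edge when counting hops.
    """
    n = len(adj)
    dist = [[math.inf] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] > 0 and dist[src][v] == math.inf:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def biased_attention(logits, hops, decay=1.0):
    """Row-wise softmax of (logits - decay * hops).

    Assumed form of the structural bias: a penalty that grows linearly
    with hop count, so distant keypoints receive less attention.
    `decay` must be positive so that unreachable pairs get zero weight.
    """
    out = []
    for row_logits, row_hops in zip(logits, hops):
        scores = [s - decay * h for s, h in zip(row_logits, row_hops)]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out
```

For a chain of four keypoints (0-1-2-3) with identical logits, each keypoint attends most strongly to itself and progressively less to keypoints that are more hops away. Replacing the hard `> 0` threshold with a function of the predicted edge weights would be one way to connect this sketch to edge weight prediction, though how the paper combines the two is not stated in the abstract.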

Code Implementations: 1 repo