CV · RO · May 7, 2021

Towards Real-World Category-level Articulation Pose Estimation

arXiv:2105.03260v1 · 63 citations
AI Analysis

This work addresses the challenge of estimating articulated object poses in complex real-world scenarios. The contribution is incremental: it builds on prior CAPE methods by extending the problem setting and providing new datasets.

The paper tackles category-level articulation pose estimation in real-world environments by introducing a new task setting, CAPER, that allows varied kinematic structures within a category and multiple instances per observation. It also proposes the ReArtNOCS framework, which the authors report performs well on both the CAPER and existing CAPE settings.

Human life is populated with articulated objects. Current Category-level Articulation Pose Estimation (CAPE) methods are studied under the single-instance setting with a fixed kinematic structure for each category. Considering these limitations, we reform this problem setting for real-world environments and suggest a CAPE-Real (CAPER) task setting. This setting allows varied kinematic structures within a semantic category, and multiple instances to co-exist in an observation of the real world. To support this task, we build an articulated model repository, ReArt-48, and present an efficient dataset generation pipeline, which contains Fast Articulated Object Modeling (FAOM) and Semi-Authentic MixEd Reality Technique (SAMERT). Accompanying the pipeline, we build a large-scale mixed reality dataset, ReArtMix, and a real-world dataset, ReArtVal. We also propose an effective framework, ReArtNOCS, that exploits RGB-D input to estimate part-level pose for multiple instances in a single forward pass. Extensive experiments demonstrate that the proposed ReArtNOCS can achieve good performance on both CAPER and CAPE settings. We believe it could serve as a strong baseline for future research on the CAPER task.
