CV, Feb 17, 2015

3D Pose from Detections

arXiv:1502.04754v3
AI Analysis

This work addresses 3D reconstruction of objects from 2D detections in computer vision. It is an incremental improvement that contributes a novel closed-form algebraic solution and a robust ellipse fitting step.

The paper infers the 3D pose and occupancy of rigid objects from 2D image detections. Its closed-form method estimates a quadric (ellipsoid) in dual space from a minimum of three views, and a robust ellipse fitting algorithm limits the effect of inaccurate bounding boxes. Experiments on synthetic and real datasets demonstrate the method's applicability.

We present a novel method to infer, in closed form, a general 3D spatial occupancy and orientation of a collection of rigid objects given 2D image detections from a sequence of images. In particular, starting from 2D ellipses fitted to bounding boxes, this novel multi-view problem can be reformulated as the estimation of a quadric (ellipsoid) in 3D. We show that an efficient solution exists in the dual space using a minimum of three views, while a solution with two views is possible through the use of regularization. However, this algebraic solution can be negatively affected by gross inaccuracies in the bounding box estimation. To this end, we also propose a robust ellipse fitting algorithm that improves performance in the presence of errors in the detected objects. Results on synthetic tests and on different real datasets, involving challenging real scenarios, demonstrate the applicability and potential of our method.
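The dual-space idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes known 3x4 camera matrices, an axis-aligned ellipse inscribed in each bounding box, and the standard projective relation in which a dual quadric Q* projects to a dual conic C* proportional to P Q* Pᵀ. The function names (`dual_conic_from_bbox`, `estimate_dual_quadric`) are hypothetical, and the robust ellipse fitting and the two-view regularization mentioned above are not covered.

```python
import numpy as np

# Index pairs for the upper triangles of symmetric 4x4 and 3x3 matrices.
IDX4 = [(i, j) for i in range(4) for j in range(i, 4)]  # 10 unknowns of Q*
IDX3 = [(i, j) for i in range(3) for j in range(i, 3)]  # 6 equations per view


def dual_conic_from_bbox(xmin, ymin, xmax, ymax):
    """Dual conic of the axis-aligned ellipse inscribed in a bounding box.

    Assumption for this sketch: the ellipse axes are aligned with the box.
    """
    cx, cy = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    a, b = (xmax - xmin) / 2.0, (ymax - ymin) / 2.0
    T = np.array([[1.0, 0.0, cx], [0.0, 1.0, cy], [0.0, 0.0, 1.0]])
    return T @ np.diag([a * a, b * b, -1.0]) @ T.T


def estimate_dual_quadric(Ps, Cs):
    """Linear estimate of a dual quadric from dual conics in several views.

    Solves s_k * C_k = P_k Q* P_k^T for the 10 entries of Q* and one unknown
    scale s_k per view, as the null vector of a stacked homogeneous system.
    """
    n = len(Ps)
    M = np.zeros((6 * n, 10 + n))
    for k, (P, C) in enumerate(zip(Ps, Cs)):
        rows = slice(6 * k, 6 * k + 6)
        for col, (i, j) in enumerate(IDX4):
            E = np.zeros((4, 4))
            E[i, j] = E[j, i] = 1.0          # basis element of symmetric Q*
            img = P @ E @ P.T                # its projection into view k
            M[rows, col] = [img[r, c] for r, c in IDX3]
        M[rows, 10 + k] = [-C[r, c] for r, c in IDX3]
    # Right singular vector of the smallest singular value spans the solution.
    v = np.linalg.svd(M)[2][-1]
    Q = np.zeros((4, 4))
    for col, (i, j) in enumerate(IDX4):
        Q[i, j] = Q[j, i] = v[col]
    return Q / -Q[3, 3]                      # fix scale so that Q*[3, 3] = -1
```

With the scale fixed this way, the ellipsoid centre can be read off as `-Q[:3, 3]`. With three views the system has 18 equations in 13 unknowns, which matches the abstract's statement that three views are the minimum for an unregularized solution.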

