CV · Nov 26, 2018

A Survey on Joint Object Detection and Pose Estimation using Monocular Vision

arXiv:1811.10216v1 · 13 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey paper that synthesizes existing research for practitioners and researchers in computer vision.

This survey provides a comprehensive overview of methods for joint object detection and pose estimation using monocular vision, covering traditional approaches, hybrid methods, and deep learning techniques while comparing their performance metrics.

In this survey we present a complete landscape of joint object detection and pose estimation methods that use monocular vision. We describe traditional approaches built on descriptors or models paired with various estimation methods. The descriptors and models include chordiograms, shape-aware deformable parts models, bag of boundaries, distance transform templates, natural 3D markers, and facet features, whereas the estimation methods include iterative clustering estimation, probabilistic networks, and iterative genetic matching. We outline hybrid approaches that use handcrafted feature extraction followed by estimation with deep learning methods. We investigate and compare, wherever possible, pure deep learning approaches (single-stage and multi-stage) for this problem. We detail the various accuracy measures and metrics and, to give a clear overview, discuss the characteristics of the relevant datasets. We also highlight the trends that have prevailed from the infancy of this problem until now.

