Car Segmentation and Pose Estimation using 3D Object Models
This work addresses scene understanding for computer vision applications. Its contribution is incremental: it extends existing CRF-based segmentation models by incorporating 3D object models.
The paper tackles image segmentation and 3D pose estimation by proposing new top-down potentials based on 3D object models. It shows that these potentials can be decomposed into standard forms for efficient inference, and that segmentation and pose estimation improve each other on a car dataset.
Image segmentation and 3D pose estimation are two key cogs in any algorithm for scene understanding. However, state-of-the-art CRF-based models for image segmentation rely mostly on 2D object models to construct top-down high-order potentials. In this paper, we propose new top-down potentials for image segmentation and pose estimation based on the shape and volume of a 3D object model. We show that these complex top-down potentials can be easily decomposed into standard forms for efficient inference in both the segmentation and pose estimation tasks. Experiments on a car dataset show that knowledge of the segmentation improves pose estimation, and vice versa.
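To make the idea of a decomposable top-down potential concrete, the following is a minimal toy sketch, not the paper's formulation. It assumes a shape potential that charges a fixed penalty w for every pixel whose label disagrees with the silhouette of a candidate pose, which reduces to per-pixel unary terms. The candidate silhouettes are hand-written stand-ins for projections of a 3D model, pairwise CRF terms are omitted, and all names (segment_given_pose, pose_given_segmentation, alternate, masks, unary) are hypothetical.

```python
def segment_given_pose(unary, mask, w):
    """Label each pixel by minimizing image cost plus shape disagreement.

    unary: list of (cost_background, cost_foreground) per pixel.
    mask:  0/1 silhouette of the assumed pose, one entry per pixel.
    w:     penalty for disagreeing with the silhouette (assumed constant).
    """
    labels = []
    for (c_bg, c_fg), m in zip(unary, mask):
        cost_bg = c_bg + w * (m != 0)  # penalize background where mask says car
        cost_fg = c_fg + w * (m != 1)  # penalize foreground where mask says none
        labels.append(1 if cost_fg < cost_bg else 0)
    return labels


def pose_given_segmentation(labels, masks):
    """Pick the candidate pose whose silhouette disagrees least with labels."""
    def hamming(mask):
        return sum(l != m for l, m in zip(labels, mask))
    return min(masks, key=lambda name: hamming(masks[name]))


def alternate(unary, masks, w, init_pose, iters=5):
    """Alternate the two steps until the pose estimate stops changing."""
    pose = init_pose
    for _ in range(iters):
        labels = segment_given_pose(unary, masks[pose], w)
        new_pose = pose_given_segmentation(labels, masks)
        if new_pose == pose:
            break
        pose = new_pose
    return labels, pose


# Toy data: six pixels in a row, two hypothetical candidate silhouettes.
masks = {
    "front": [0, 1, 1, 1, 1, 0],
    "side": [1, 1, 1, 0, 0, 0],
}
# Pixel 2 is ambiguous: image evidence alone slightly prefers background.
unary = [(2.0, 0.0), (2.0, 0.0), (0.9, 1.0), (0.0, 2.0), (0.0, 2.0), (0.0, 2.0)]

labels, pose = alternate(unary, masks, w=0.5, init_pose="front")
print(labels, pose)
```

In this toy setting the shape term flips the ambiguous pixel to foreground, and the resulting segmentation in turn selects the matching silhouette, which mirrors the mutual benefit the abstract reports. A full CRF would add pairwise smoothness terms and use a proper inference routine in place of the per-pixel minimum.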