CV · Nov 27, 2018

Part-level Car Parsing and Reconstruction from Single Street View

arXiv:1811.10837v2 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of robust car parsing and reconstruction for autonomous driving and urban analysis. Its advance is incremental, since it builds on existing part-based and 3D estimation techniques.

The paper tackles the problem of estimating the 3D shape, pose, and semantic parts of cars from single street view images. It achieves state-of-the-art performance on the ApolloCar3D dataset, improving mean A3DP-Abs by 12.57 and 8.91 percentage points over 3D-RCNN and DeepMANTA, respectively.

Part information has been shown to be robust to occlusions and viewpoint changes, which benefits a variety of vision tasks. However, we found very little work on car pose estimation and reconstruction from street views that leverages part information. This paper makes two major contributions. First, we make the first attempt to build a framework that simultaneously estimates the shape, translation, orientation, and semantic parts of cars in 3D space from a single street view. Because annotating semantic parts on real street views is labor-intensive, we propose an approach that implicitly transfers part features from synthesized images to real street views. For pose and shape estimation, we propose a novel network structure that uses both part features and 3D losses. Second, we are the first to construct a high-quality dataset that contains 348 different car models with physical dimensions and part-level annotations based on global and local deformations. Given these models, we further generate 60K synthesized images with randomized orientation, illumination, occlusion, and texture. Our results demonstrate that part segmentation performance improves significantly after applying our implicit transfer approach. Our network for pose and shape estimation achieves state-of-the-art performance on the ApolloCar3D dataset and outperforms 3D-RCNN and DeepMANTA by 12.57 and 8.91 percentage points, respectively, in terms of mean A3DP-Abs.
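The synthesized-image step in the abstract can be pictured as sampling one set of rendering parameters per image. The following is only a minimal sketch of that idea, not the authors' pipeline: the `RenderParams` fields, the value ranges, and the texture-bank size are all hypothetical. Only the 348 car models, the 60K image count, and the four randomized factors come from the abstract.

```python
import random
from dataclasses import dataclass


@dataclass
class RenderParams:
    """One randomized configuration for a synthesized image (hypothetical fields)."""
    model_id: int           # index into the 348 car models
    yaw_deg: float          # orientation about the vertical axis, in degrees
    light_intensity: float  # relative illumination strength (assumed range)
    occlusion_frac: float   # fraction of the car hidden by an occluder (assumed range)
    texture_id: int         # index into an assumed texture bank


def sample_render_params(rng, num_models=348, num_textures=100):
    """Draw one configuration; every range here is an assumption, not from the paper."""
    return RenderParams(
        model_id=rng.randrange(num_models),
        yaw_deg=rng.uniform(0.0, 360.0),
        light_intensity=rng.uniform(0.5, 1.5),
        occlusion_frac=rng.uniform(0.0, 0.5),
        texture_id=rng.randrange(num_textures),
    )


def sample_dataset(num_images, seed=0):
    """Sample parameters for a whole dataset, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [sample_render_params(rng) for _ in range(num_images)]


# 60K configurations, matching the dataset size stated in the abstract.
params = sample_dataset(60_000)
```

Each sampled configuration would then be handed to a renderer, which is outside the scope of this sketch. Seeding the generator keeps the sampled set reproducible across runs.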

