CV · Jun 6, 2018

PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds

arXiv:1806.02170v3 · 49 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for accurate 3D motion estimation in applications such as self-driving cars, using point clouds from laser scanners. The advance is incremental because it builds on existing deep learning methods for scene flow.

The paper tackles the problem of estimating 3D motion from unstructured point clouds. It proposes a deep neural network that jointly predicts 3D scene flow, 3D bounding boxes, and rigid body motion in a single forward pass, and it reports robust performance in a comparison with classic and learning-based techniques.

Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.
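The abstract's point about the global representation of rigid body motion can be illustrated with a small NumPy sketch. It is not the paper's network or its exact parametrization. It only assumes that "global" means the world-frame map x' = R x + t and that a "local" description stores the rotation about the object's own center plus the shift of that center. The helper names `rot_z` and `global_translation` are hypothetical.

```python
import numpy as np


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def global_translation(R, center, center_shift):
    """Translation t of the world-frame map x' = R @ x + t.

    Assumed motion model (illustrative only): the object rotates by R about
    its own center and the center moves by center_shift, so
    x' = R @ (x - center) + center + center_shift.
    """
    return center + center_shift - R @ center


# The same local motion (10 degree yaw, 1 m forward) for two objects
# that differ only in where they sit in the scene.
R = rot_z(np.deg2rad(10.0))
d = np.array([1.0, 0.0, 0.0])
c_a = np.array([5.0, 2.0, 0.0])
c_b = c_a + np.array([20.0, -3.0, 0.0])

t_a = global_translation(R, c_a, d)
t_b = global_translation(R, c_b, d)

# Local description: (R, d) is identical for both objects.
# Global description: t changes with the object's position.
print("global t for object A:", t_a)
print("global t for object B:", t_b)
```

Under these assumptions the two objects share the same local motion (R, d) but have different global translations, which differ by (I - R)(c_b - c_a). A convolutional network applies the same filters at every location, so a regression target that changes when the object is merely shifted is awkward to learn, while the local target stays the same.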

