Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk
This work addresses the scarcity of annotated data for scene flow estimation from point clouds with a self-supervised method relevant to applications such as autonomous driving and robotics. The contribution is incremental, since it builds on existing self-supervised matching approaches.
The paper tackles self-supervised scene flow estimation from point clouds. It targets two weaknesses of point-wise matching, namely that it ignores discriminative features and permits many-to-one correspondences, by casting matching as optimal transport with a cost built from multiple descriptors and by adding a random walk module that enforces local consistency. The method achieves state-of-the-art performance among self-supervised methods on FlyingThings3D and KITTI and performs on par with some supervised approaches without using ground-truth flow for training.
Due to the scarcity of annotated scene flow data, self-supervised scene flow learning in point clouds has attracted increasing attention. In the self-supervised setting, establishing correspondences between two point clouds to approximate scene flow is an effective approach. Previous methods often obtain correspondences by point-wise matching that only takes the distance between 3D point coordinates into account, which introduces two critical issues: (1) it overlooks other discriminative measures, such as color and surface normal, which often provide useful cues for accurate matching; and (2) the matching is unconstrained, so multiple points can end up with the same corresponding point, which often degrades performance. To address these issues, we formulate the matching task as an optimal transport problem. The resulting optimal assignment matrix is used to guide the generation of pseudo ground truth. In this optimal transport problem, we design the transport cost by considering multiple descriptors and encourage one-to-one matching through mass equality constraints. In addition, we construct a graph on the points and introduce a random walk module to encourage local consistency of the pseudo labels. Comprehensive experiments on FlyingThings3D and KITTI show that our method achieves state-of-the-art performance among self-supervised learning methods. Our self-supervised method even performs on par with some supervised learning approaches, although it does not need any ground truth flow for training.
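
The matching pipeline described in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The Sinkhorn iteration is a standard entropy-regularized solver used here as a stand-in for the optimal transport step. The parameters `feat_weight`, `eps`, `sigma`, and `alpha` are hypothetical values chosen for the toy data. The hard argmax assignment is one simple way to turn the transport plan into pseudo labels, and the random walk is written as a generic random walk with restart over a fully connected Gaussian affinity graph.

```python
import numpy as np


def pairwise_sq_dist(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def sinkhorn(cost, eps=0.05, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals.

    The uniform marginals play the role of mass equality constraints:
    every point sends and receives the same mass, which discourages
    many-to-one matches. `eps` is an assumed regularization strength.
    """
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)
    nu = np.full(m, 1.0 / m)
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def random_walk_refine(points, flow, sigma=0.5, alpha=0.8):
    """Smooth per-point flow with a random walk with restart.

    Assumed formulation: the fixed point of F = alpha * A @ F + (1 - alpha) * F0,
    where A is the row-normalized Gaussian affinity between points.
    """
    weights = np.exp(-pairwise_sq_dist(points, points) / sigma ** 2)
    transition = weights / weights.sum(axis=1, keepdims=True)
    n = len(points)
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * transition, flow)


# Toy data: the second cloud is a shuffled, translated copy of the first.
rng = np.random.default_rng(0)
n = 32
p = rng.uniform(-1.0, 1.0, size=(n, 3))       # source coordinates
feat_p = rng.uniform(0.0, 1.0, size=(n, 3))   # stand-in for color or normals
true_flow = np.array([0.05, 0.0, -0.02])
perm = rng.permutation(n)
q = (p + true_flow)[perm]                     # target coordinates
feat_q = feat_p[perm]

# Transport cost from multiple descriptors (hypothetical weighting).
feat_weight = 1.0
cost = pairwise_sq_dist(p, q) + feat_weight * pairwise_sq_dist(feat_p, feat_q)
cost = cost / cost.max()                      # keep exp(-cost / eps) well scaled

plan = sinkhorn(cost)
match = plan.argmax(axis=1)                   # hard assignment per source point
pseudo_flow = q[match] - p                    # pseudo ground truth flow
refined = random_walk_refine(p, pseudo_flow)  # locally consistent pseudo labels

print("mean pseudo flow:", pseudo_flow.mean(axis=0))
print("mean refined flow:", refined.mean(axis=0))
```

Because the transition matrix is row-stochastic, this refinement leaves a spatially constant flow field unchanged and only smooths out labels that disagree with their neighbors. In a real setting the affinity graph would more plausibly be restricted to local neighborhoods, and the descriptors would come from the actual color and normal data rather than random stand-ins.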