SF2SE3: Clustering Scene Flow into SE(3)-Motions via Proposal and Selection
The work addresses scene understanding for robotics and autonomous systems. Its contribution is incremental: it builds on existing optical flow and depth estimation methods and adds novel sampling and selection strategies for motion proposals.
The paper tackles the problem of estimating scene dynamics by segmenting independently moving rigid objects and their SE(3)-motions from two consecutive stereo or RGB-D images, achieving performance on par with state-of-the-art for scene flow estimation and improved accuracy for segmentation and odometry.
We propose SF2SE3, a novel approach to estimate scene dynamics in the form of a segmentation into independently moving rigid objects and their SE(3)-motions. SF2SE3 operates on two consecutive stereo or RGB-D images. First, noisy scene flow is obtained by applying existing optical flow and depth estimation algorithms. SF2SE3 then iteratively (1) samples pixel sets to compute SE(3)-motion proposals, and (2) selects the best SE(3)-motion proposal with respect to a maximum coverage formulation. Finally, objects are formed by assigning pixels uniquely to the selected SE(3)-motions based on consistency with the input scene flow and spatial proximity. The main novelties are a more informed strategy for the sampling of motion proposals and a maximum coverage formulation for the proposal selection. We evaluate SF2SE3 on multiple datasets for scene flow estimation, object segmentation and visual odometry. SF2SE3 performs on par with the state of the art for scene flow estimation and is more accurate for segmentation and odometry.
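The proposal-and-selection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that scene flow is already available as noise-free 3D point correspondences, that each proposal is fitted with a standard least-squares rigid alignment (Kabsch), and that selection is done by a plain greedy maximum coverage heuristic. The function names, the inlier tolerance `tol` and the stopping threshold `min_gain` are hypothetical, and the informed sampling strategy and the spatial-proximity term of the final assignment are omitted.

```python
import numpy as np

def fit_se3(src, dst):
    """Least-squares rigid motion (R, t) such that dst ~= src @ R.T + t (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def inlier_mask(R, t, src, dst, tol):
    """Points whose observed motion agrees with the proposal (R, t) within tol."""
    residual = np.linalg.norm(src @ R.T + t - dst, axis=1)
    return residual < tol

def greedy_max_coverage(masks, min_gain=1):
    """Repeatedly pick the proposal that covers the most uncovered points.

    masks has shape (num_proposals, num_points); min_gain must be >= 1.
    """
    covered = np.zeros(masks.shape[1], dtype=bool)
    selected = []
    while True:
        gains = (masks & ~covered).sum(axis=1)
        best = int(gains.argmax())
        if gains[best] < min_gain:
            break
        selected.append(best)
        covered |= masks[best]
    return selected, covered

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Synthetic scene (assumption): two rigid objects with different motions.
rng = np.random.default_rng(0)
pts_a, pts_b = rng.normal(size=(50, 3)), rng.normal(size=(30, 3)) + 5.0
R_a, t_a = rot_z(0.3), np.array([1.0, 0.0, 0.0])
R_b, t_b = rot_z(-0.5), np.array([0.0, 2.0, 0.5])
src = np.vstack([pts_a, pts_b])
dst = np.vstack([pts_a @ R_a.T + t_a, pts_b @ R_b.T + t_b])

# Three sampled pixel sets: within object A, within object B, and a mixed set.
samples = [np.arange(0, 4), np.arange(50, 54), np.array([0, 1, 50, 51])]
proposals = [fit_se3(src[idx], dst[idx]) for idx in samples]
masks = np.array([inlier_mask(R, t, src, dst, tol=1e-6) for R, t in proposals])

selected, covered = greedy_max_coverage(masks)
```

In this toy setting the mixed sample yields a motion that explains neither object well, so the coverage criterion favours the two proposals drawn from a single object each. With real, noisy scene flow the tolerance and the stopping threshold would need tuning.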