SelfOccFlow: Towards end-to-end self-supervised 3D Occupancy Flow prediction
The paper addresses the need for situational awareness in dynamic environments for autonomous driving with a self-supervised approach that removes the reliance on costly human annotations, though its technical contributions appear incremental.
The paper tackles the problem of estimating 3D occupancy and motion for autonomous driving without expensive annotations, proposing a self-supervised method that disentangles static and dynamic components and learns motion through temporal aggregation. The method is evaluated on SemanticKITTI, KITTI-MOT, and nuScenes.
Estimating 3D occupancy and motion in the vehicle's surroundings is essential for autonomous driving, enabling situational awareness in dynamic environments. Existing approaches jointly learn geometry and motion but rely on expensive 3D occupancy and flow annotations, velocity labels from bounding boxes, or pretrained optical flow models. We propose a self-supervised method for 3D occupancy flow estimation that eliminates the need for human-produced annotations or external flow supervision. Our method disentangles the scene into separate static and dynamic signed distance fields and learns motion implicitly through temporal aggregation. Additionally, we introduce a strong self-supervised flow cue derived from features' cosine similarities. We demonstrate the efficacy of our 3D occupancy flow method on SemanticKITTI, KITTI-MOT, and nuScenes.
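To make two of the ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the static and dynamic signed distance fields are sampled at the same points and can be merged with a pointwise minimum (the standard union of two signed distance fields), and that a flow cue can be formed by matching each point's feature at time t to its most cosine-similar feature at time t+1. The function names `compose_sdf` and `cosine_flow_cue`, the array shapes, and the nearest-match rule are all hypothetical choices made for illustration; the abstract does not specify how the cue is computed.

```python
import numpy as np


def compose_sdf(static_sdf: np.ndarray, dynamic_sdf: np.ndarray) -> np.ndarray:
    """Merge two signed distance fields sampled at the same points.

    The union of two shapes has an SDF given by the pointwise minimum.
    (Assumption: this is one plausible way to recombine the two fields.)
    """
    return np.minimum(static_sdf, dynamic_sdf)


def cosine_flow_cue(feat_t, feat_t1, pos_t, pos_t1, eps=1e-8):
    """Hypothetical flow cue from feature cosine similarity.

    feat_t:  (N, C) features at time t
    feat_t1: (M, C) features at time t+1
    pos_t:   (N, 3) point positions at time t
    pos_t1:  (M, 3) point positions at time t+1

    Returns a (N, 3) displacement to the best-matching point at t+1
    and the (N,) cosine similarity of that match.
    """
    # L2-normalise features so the dot product equals cosine similarity.
    a = feat_t / (np.linalg.norm(feat_t, axis=1, keepdims=True) + eps)
    b = feat_t1 / (np.linalg.norm(feat_t1, axis=1, keepdims=True) + eps)
    sim = a @ b.T                      # (N, M) pairwise cosine similarities
    best = sim.argmax(axis=1)          # index of the best match at t+1
    flow = pos_t1[best] - pos_t        # displacement towards the match
    score = sim[np.arange(len(best)), best]
    return flow, score
```

In practice the similarity score could be used to down-weight unreliable matches, and the search would likely be restricted to a local neighbourhood rather than all pairs; both are design choices outside what the abstract states.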