Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
This work addresses the need for efficient perception in autonomous vehicles by making diffusion-based scene completion models more practical, though the contribution is incremental because it builds on existing diffusion methods.
The paper tackles the slow sampling speed of diffusion models for 3D LiDAR scene completion by proposing a distillation method called ScoreLiDAR, which reduces completion time from 30.55 to 5.37 seconds per frame (>5x acceleration) on SemanticKITTI while outperforming state-of-the-art 3D LiDAR scene completion models.
Diffusion models have been applied to 3D LiDAR scene completion due to their strong training stability and high completion quality. However, their slow sampling speed limits the practical application of diffusion-based scene completion models, since autonomous vehicles require efficient perception of their surrounding environments. This paper proposes a novel distillation method tailored for 3D LiDAR scene completion models, dubbed ScoreLiDAR, which achieves efficient yet high-quality scene completion. ScoreLiDAR enables the distilled model to sample in significantly fewer steps after distillation. To improve completion quality, we also introduce a novel Structural Loss, which encourages the distilled model to capture the geometric structure of the 3D LiDAR scene. The loss contains a scene-wise term constraining the holistic structure and a point-wise term constraining the key landmark points and their relative configuration. Extensive experiments demonstrate that ScoreLiDAR reduces the completion time from 30.55 to 5.37 seconds per frame (>5x) on SemanticKITTI and achieves superior performance compared to state-of-the-art 3D LiDAR scene completion models. Our model and code are publicly available at https://github.com/happyw1nd/ScoreLiDAR.
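To make the two-term shape of the Structural Loss concrete, the following is a minimal NumPy sketch and not the released implementation. It assumes the scene-wise term can be approximated by a symmetric Chamfer-style distance between the completed and ground-truth point clouds, and that the point-wise term compares pairwise distances among a set of key points. The function names, the nearest-neighbour matching of key points, the `key_idx` selection, and the `weight` parameter are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between every row of a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def scene_wise_term(pred, gt):
    """Assumed scene-wise term: symmetric Chamfer-style distance."""
    d = pairwise_dist(pred, gt)
    # Mean nearest-neighbour distance in both directions.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def point_wise_term(pred, gt, key_idx):
    """Assumed point-wise term: mismatch in key-point relative configuration."""
    gt_keys = gt[key_idx]
    # Hypothetical matching: nearest completed point to each ground-truth key point.
    nearest = pairwise_dist(gt_keys, pred).argmin(axis=1)
    pred_keys = pred[nearest]
    # Compare the pairwise distance matrices of the two key-point sets.
    d_gt = pairwise_dist(gt_keys, gt_keys)
    d_pred = pairwise_dist(pred_keys, pred_keys)
    return np.abs(d_gt - d_pred).mean()

def structural_loss(pred, gt, key_idx, weight=1.0):
    """Sum of the two terms; weight is an illustrative balancing factor."""
    return scene_wise_term(pred, gt) + weight * point_wise_term(pred, gt, key_idx)

# Toy usage with random points standing in for LiDAR scans.
rng = np.random.default_rng(0)
gt = rng.normal(size=(50, 3))
key_idx = np.array([0, 5, 10])
loss_same = structural_loss(gt.copy(), gt, key_idx)
loss_shifted = structural_loss(gt + 0.5, gt, key_idx)
```

In this sketch an exact completion yields a loss of zero, while a displaced completion is penalised through the scene-wise term. A training implementation would use a differentiable framework and a principled key-point selection, neither of which is shown here.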