CV | Jun 5, 2022

Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation

arXiv:2206.02099v1 | 237 citations | h-index: 128 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficient model deployment for autonomous driving by enabling slim student models to match teacher performance, though it is incremental as it builds on existing distillation methods.

The paper tackles knowledge distillation for LiDAR semantic segmentation by proposing Point-to-Voxel Knowledge Distillation (PVD), which transfers knowledge at both the point and voxel levels to cope with the sparsity, randomness and varying density of point clouds. On the Cylinder3D model it achieves roughly 75% MACs reduction and a 2x speedup, and it ranks 1st on the SemanticKITTI leaderboard among published algorithms.

This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation. Directly employing previous distillation approaches yields inferior results due to the intrinsic challenges of point clouds, i.e., sparsity, randomness and varying density. To tackle these problems, we propose Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge from both the point level and the voxel level. Specifically, we first leverage both pointwise and voxelwise output distillation to complement the sparse supervision signals. Then, to better exploit the structural information, we divide the whole point cloud into several supervoxels and design a difficulty-aware sampling strategy that more frequently samples supervoxels containing less-frequent classes and faraway objects. On these supervoxels, we propose inter-point and inter-voxel affinity distillation, where the similarity information between points and voxels helps the student model better capture the structural information of the surrounding environment. We conduct extensive experiments on two popular LiDAR segmentation benchmarks, i.e., nuScenes and SemanticKITTI. On both benchmarks, our PVD consistently outperforms previous distillation approaches by a large margin on three representative backbones, i.e., Cylinder3D, SPVNAS and MinkowskiNet. Notably, on the challenging nuScenes and SemanticKITTI datasets, our method achieves roughly 75% MACs reduction and a 2x speedup on the competitive Cylinder3D model and ranks 1st on the SemanticKITTI leaderboard among all published algorithms. Our code is available at https://github.com/cardwing/Codes-for-PVKD.
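The two loss families named in the abstract, output distillation and affinity distillation, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss forms, so everything here is an assumption made for illustration: KL divergence for the output term, cosine similarity for the affinity, mean squared error between affinity matrices, and all function names. It is not the authors' implementation, which is in the linked repository. The same functions apply to rows that represent points or voxels.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def output_distillation_loss(student_logits, teacher_logits, eps=1e-12):
    """Mean KL(teacher || student) over rows (assumed loss form).

    Each row holds the class logits of one point or one voxel.
    """
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=-1)
    return float(kl.mean())


def affinity_matrix(features, eps=1e-12):
    """Pairwise cosine similarity between feature rows, shape (N, N)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)
    return unit @ unit.T


def affinity_distillation_loss(student_feats, teacher_feats):
    """Mean squared difference between the two affinity matrices (assumed form)."""
    diff = affinity_matrix(student_feats) - affinity_matrix(teacher_feats)
    return float(np.mean(diff ** 2))


# Toy data standing in for the N = 6 elements of one sampled supervoxel.
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(6, 4))   # 4 hypothetical classes
student_logits = rng.normal(size=(6, 4))
teacher_feats = rng.normal(size=(6, 16))   # teacher channel width
student_feats = rng.normal(size=(6, 8))    # slimmer student channel width

print("output KD loss:", output_distillation_loss(student_logits, teacher_logits))
print("affinity KD loss:", affinity_distillation_loss(student_feats, teacher_feats))
```

Because each affinity matrix has shape (N, N) regardless of channel width, this form of the loss compares a slim student with a wider teacher without any projection layer. Computing it on sampled supervoxels, as the abstract describes, also keeps N small enough for the quadratic matrix to be affordable.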

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
