CV · Mar 15, 2021

S-AT GCN: Spatial-Attention Graph Convolution Network based Feature Enhancement for 3D Object Detection

Li Wang, Chenfei Wang, Xinyu Zhang, Tianwei Lan, Jun Li

arXiv:2103.08439v12.613 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in 3D object detection for autonomous driving, offering incremental improvements for small object detection.

The paper tackles the partition effect in 3D object detection for autonomous vehicles, where partition-based methods split a single object such as a pedestrian into several pieces. It proposes S-AT GCN, which forms Feature Enhancement (FE) layers, to counter this effect. On KITTI, the FE layers improve 3D mAP by 3.62% for pedestrians and 4.21% for cyclists.
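The partition effect can be illustrated with a small sketch. It assumes a pillar-style partition on a regular x-y grid with a 0.16 m cell size and a synthetic, uniformly sampled "pedestrian" point cloud; both are illustrative assumptions, and `pillar_ids` is a hypothetical helper rather than code from the paper's repository.

```python
import numpy as np

def pillar_ids(points, cell=0.16):
    """Map each point's (x, y) to an integer pillar index on a regular grid.

    The 0.16 m cell size is an assumed value for illustration.
    """
    return np.floor(points[:, :2] / cell).astype(np.int64)

# Hypothetical pedestrian: 200 points in a 0.6 m x 0.6 m footprint, 1.7 m tall.
rng = np.random.default_rng(0)
pedestrian = rng.uniform([10.0, 5.0, 0.0], [10.6, 5.6, 1.7], size=(200, 3))

ids = pillar_ids(pedestrian)
num_pillars = len(np.unique(ids, axis=0))
print(f"one pedestrian is spread over {num_pillars} pillars")
```

Each pillar only sees its own slice of the object, which is why features computed per pillar can miss the object's overall geometry.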

3D object detection plays a crucial role in environmental perception for autonomous vehicles and is a prerequisite for decision and control. This paper analyses the inherent drawbacks of partition-based methods. In the partition operation, a single instance such as a pedestrian is sliced into several pieces, which we call the partition effect. We propose the Spatial-Attention Graph Convolution (S-AT GCN), forming the Feature Enhancement (FE) layers to overcome this drawback. The S-AT GCN uses graph convolution and a spatial attention mechanism to extract local geometrical structure features. This gives the network more meaningful features for the foreground. Our experiments on the KITTI 3D object and bird's eye view detection benchmarks show that S-AT Conv and FE layers are effective, especially for small objects. FE layers boost 3D mAP by 3.62% for the pedestrian class and 4.21% for the cyclist class. The time cost of these extra FE layers is limited. PointPillars with FE layers can achieve 48 FPS, satisfying the real-time requirement.
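As a rough illustration of combining graph convolution with spatial attention, the sketch below builds a k-nearest-neighbour graph over points, computes edge features from each point and its neighbours, and aggregates them with softmax weights derived from relative positions. This is not the authors' S-AT GCN: the edge-feature form, the attention scoring, and all names (`knn_indices`, `spatial_attention_graph_conv`, `w_feat`, `w_att`) are assumptions made for illustration, and the weights are random rather than learned.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point (excluding itself)."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention_graph_conv(points, feats, w_feat, w_att, k=4):
    """Assumed form of a graph convolution with spatial attention.

    points: (N, 3) coordinates, feats: (N, C) per-point features,
    w_feat: (2C, C_out) edge-feature weights, w_att: (3,) attention weights.
    """
    idx = knn_indices(points, k)                      # (N, k)
    rel_pos = points[idx] - points[:, None, :]        # (N, k, 3)
    rel_feat = feats[idx] - feats[:, None, :]         # (N, k, C)
    center = np.broadcast_to(feats[:, None, :], rel_feat.shape)
    # Edge features: centre feature concatenated with the neighbour difference.
    edge = np.concatenate([center, rel_feat], axis=-1) @ w_feat
    edge = np.maximum(edge, 0.0)                      # ReLU, (N, k, C_out)
    # Spatial attention: score each neighbour from its relative position.
    alpha = softmax(rel_pos @ w_att, axis=1)          # (N, k), rows sum to 1
    return np.sum(alpha[..., None] * edge, axis=1)    # (N, C_out)

rng = np.random.default_rng(0)
points = rng.normal(size=(16, 3))
feats = rng.normal(size=(16, 8))
w_feat = rng.normal(size=(16, 32)) * 0.1
w_att = rng.normal(size=(3,))
out = spatial_attention_graph_conv(points, feats, w_feat, w_att, k=4)
print(out.shape)
```

Because the attention weights depend on where each neighbour sits relative to the centre point, the aggregated feature reflects local geometric structure rather than a plain average over neighbours.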
