SBNet: Sparse Blocks Network for Fast Inference
This work addresses slow inference in deep learning for real-time tasks such as object detection. It offers a practical speed-up, though the contribution is incremental because it builds on prior sparse activation methods.
The paper tackles the high computational cost of dense convolutional neural networks in real-time applications. It uses computation masks to reduce computation in high-resolution networks and achieves significant wall-clock speed-ups without noticeable accuracy loss in LiDAR-based 3D object detection.
Conventional deep convolutional neural networks (CNNs) apply convolution operators uniformly in space across all feature maps for hundreds of layers, which incurs a high computational cost for real-time applications. For many problems, such as object detection and semantic segmentation, we can obtain a low-cost computation mask, either from a priori problem knowledge or from a low-resolution segmentation network. We show that such computation masks can be used to reduce computation in the high-resolution main network. Variants of sparse activation CNNs have previously been explored on small-scale tasks and showed no degradation in object classification accuracy, but their gains were often measured in theoretical FLOPs without realizing a practical speed-up over highly optimized dense convolution implementations. In this work, we leverage the sparsity structure of computation masks and propose a novel tiling-based sparse convolution algorithm. We verify the effectiveness of our sparse CNN on LiDAR-based 3D object detection and report significant wall-clock speed-ups compared to dense convolution without noticeable loss of accuracy.
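To make the idea of a tiling-based sparse convolution concrete, the following is a minimal NumPy sketch of one way it could work. It is an illustration under stated assumptions, not the paper's implementation: the mask is reduced to a list of active blocks, each active block is gathered with a halo, a small dense convolution is run on it, and the result is scattered back. The function names (`active_blocks`, `conv2d_valid`, `sparse_block_conv`), the single-channel input, the square odd-sized kernel, and the non-overlapping block layout are all hypothetical choices made for brevity.

```python
import numpy as np

def active_blocks(mask, block):
    """Return (row, col) indices of blocks whose mask tile has any nonzero entry.

    Assumes mask height and width are multiples of `block` (illustrative choice).
    """
    h, w = mask.shape
    tiles = mask.reshape(h // block, block, w // block, block)
    return np.argwhere(tiles.max(axis=(1, 3)) > 0)

def conv2d_valid(x, k):
    """Naive single-channel 'valid' cross-correlation, used as the dense kernel."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=x.dtype)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * x[i:i + oh, j:j + ow]
    return out

def sparse_block_conv(x, mask, k, block):
    """Convolve only inside active blocks; inactive blocks stay zero.

    Assumes a square kernel with odd side length and zero padding at the border.
    """
    pad = k.shape[0] // 2
    xp = np.pad(x, pad)          # zero-pad so border blocks have a halo
    out = np.zeros_like(x)
    for bi, bj in active_blocks(mask, block):
        r, c = bi * block, bj * block
        # Gather: the block plus a halo of `pad` pixels on every side.
        tile = xp[r:r + block + 2 * pad, c:c + block + 2 * pad]
        # Dense convolution on the small tile, then scatter into the output.
        out[r:r + block, c:c + block] = conv2d_valid(tile, k)
    return out

# Usage: a mask with a single active pixel activates only one 4x4 block.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))
mask = np.zeros((8, 8))
mask[1, 6] = 1
y = sparse_block_conv(x, mask, k, block=4)
```

In this sketch the work scales with the number of active blocks rather than with the full feature map. Each per-block operation is still an ordinary dense convolution, which is the property that would let a tiled approach reuse highly optimized dense kernels.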