Booster-SHOT: Boosting Stacked Homography Transformations for Multiview Pedestrian Detection with Attention
This improves pedestrian detection accuracy for surveillance systems, though the contribution appears incremental, since it mainly combines attention with existing techniques.
The paper tackles multiview pedestrian detection by proposing Booster-SHOT, an end-to-end convolutional approach with a novel Homography Attention Module (HAM), achieving state-of-the-art performance of 92.9% MODA on Wildtrack and 94.2% on MultiviewX.
Improving multi-view aggregation is integral for multi-view pedestrian detection, which aims to obtain a bird's-eye-view pedestrian occupancy map from images captured through a set of calibrated cameras. Inspired by the success of attention modules for deep neural networks, we first propose a Homography Attention Module (HAM) which is shown to boost the performance of existing end-to-end multiview detection approaches by utilizing a novel channel gate and spatial gate. Additionally, we propose Booster-SHOT, an end-to-end convolutional approach to multiview pedestrian detection incorporating our proposed HAM as well as elements from previous approaches such as view-coherent augmentation or stacked homography transformations. Booster-SHOT achieves 92.9% and 94.2% for MODA on Wildtrack and MultiviewX respectively, outperforming the state-of-the-art by 1.4% on Wildtrack and 0.5% on MultiviewX, achieving state-of-the-art performance overall for standard evaluation metrics used in multi-view pedestrian detection.
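The headline numbers above are reported in MODA (Multiple Object Detection Accuracy). As a rough illustration of what that metric measures, here is a minimal sketch that assumes the common equal-weight form, MODA = 1 - (misses + false positives) / ground-truth count, accumulated over frames. The function name `moda` and the per-frame counts are hypothetical and are not taken from the paper, and the sketch skips the step of matching detections to ground truth on the ground plane.

```python
def moda(false_negatives, false_positives, ground_truth):
    """Equal-weight MODA over a sequence of frames.

    Each argument is a list of per-frame counts. Assumes misses and
    false positives carry the same cost (an assumption of this sketch).
    """
    total_gt = sum(ground_truth)
    if total_gt == 0:
        raise ValueError("MODA is undefined when there are no ground-truth objects")
    errors = sum(false_negatives) + sum(false_positives)
    return 1.0 - errors / total_gt


# Hypothetical per-frame counts for three frames (illustration only).
fn = [2, 1, 1]     # missed pedestrians per frame
fp = [1, 0, 1]     # false detections per frame
gt = [20, 20, 20]  # ground-truth pedestrians per frame

score = moda(fn, fp, gt)
print(f"MODA = {score:.1%}")
```

Because both error types are subtracted from 1, MODA penalizes missed pedestrians and spurious detections alike, and it can go negative when the errors outnumber the ground-truth objects.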