CV · LG · IV · Mar 16, 2019

GFD-SSD: Gated Fusion Double SSD for Multispectral Pedestrian Detection

arXiv:1903.06999v2 · 79 citations
Originality: Incremental advance
AI Analysis

This work addresses pedestrian detection for autonomous driving by improving accuracy and speed in multispectral scenarios, though it is incremental as it builds on existing SSD and fusion methods.

The paper tackles multispectral pedestrian detection by proposing a Gated Fusion Double SSD (GFD-SSD), which fuses color and thermal images using novel Gated Fusion Units. It achieves the lowest miss rate on the KAIST dataset at an inference speed two times faster than Faster-RCNN-based fusion networks.

Pedestrian detection is an essential task in autonomous driving research. In addition to typical color images, thermal images aid detection in dark environments. Hence, it is worthwhile to explore an integrated approach that takes advantage of both color and thermal images simultaneously. In this paper, we propose a novel approach to fuse color and thermal sensors using deep neural networks (DNNs). Current state-of-the-art DNN object detectors range from two-stage to one-stage mechanisms. Two-stage detectors, like Faster-RCNN, achieve higher accuracy, while one-stage detectors, such as the Single Shot Detector (SSD), run faster. To balance this trade-off, especially for autonomous driving applications, we investigate a fusion strategy that combines two SSDs on color and thermal inputs. Traditional fusion methods stack selected features from each channel and adjust their weights. In this paper, we propose two variations of novel Gated Fusion Units (GFUs) that learn the combination of feature maps generated by the middle layers of the two SSDs. Leveraging GFUs across the entire feature pyramid structure, we propose several mixed versions of both stack fusion and gated fusion. Experiments are conducted on the KAIST multispectral pedestrian detection dataset. Our Gated Fusion Double SSD (GFD-SSD) outperforms stacked fusion and achieves the lowest miss rate in the benchmark, at an inference speed that is two times faster than Faster-RCNN-based fusion networks.
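
To make the contrast between stack fusion and gated fusion concrete, here is a minimal NumPy sketch. It is not the paper's GFU: the abstract does not give the unit's equations, so the sigmoid gate, the convex blend, and the names `stack_fusion`, `gated_fusion`, `weight`, and `bias` are all illustrative assumptions. The only point is that a gate computed from both feature maps decides, per location, how much each modality contributes.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def stack_fusion(color, thermal):
    """Stack fusion: concatenate the two feature maps along the channel axis."""
    return np.concatenate([color, thermal], axis=0)


def gated_fusion(color, thermal, weight, bias):
    """Hypothetical gated fusion (an assumption, not the paper's exact GFU).

    color, thermal: feature maps of shape (C, H, W)
    weight: (C, 2C) matrix acting as a 1x1 convolution over stacked channels
    bias: (C,) vector
    """
    stacked = stack_fusion(color, thermal)  # (2C, H, W)
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    logits = np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]
    gate = sigmoid(logits)  # values in (0, 1)
    # Blend the two modalities element-wise using the learned gate.
    return gate * color + (1.0 - gate) * thermal


# Toy inputs standing in for SSD feature maps (random, for illustration only).
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
color = rng.standard_normal((C, H, W))
thermal = rng.standard_normal((C, H, W))
weight = rng.standard_normal((C, 2 * C)) * 0.1
bias = np.zeros(C)

stacked = stack_fusion(color, thermal)
fused = gated_fusion(color, thermal, weight, bias)
```

In this sketch, stack fusion doubles the channel count, while the gated variant keeps the original shape and lets the weights decide how much to trust each sensor at each location. In a trained network, `weight` and `bias` would be learned jointly with the detector rather than sampled at random.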

