Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection
This work addresses fast and robust pedestrian detection for applications such as autonomous driving. It is incremental in that it builds on existing deep learning methods; its novelty lies in the fusion approach.
The authors propose a deep neural network fusion architecture that runs multiple networks in parallel for speed and robustness. It achieves state-of-the-art performance on the Caltech Pedestrian dataset, with significant boosts on several protocols and faster processing than other methods.
We propose a deep neural network fusion architecture for fast and robust pedestrian detection. The proposed architecture allows multiple networks to be processed in parallel for speed. A single-shot deep convolutional network is trained as an object detector to generate all possible pedestrian candidates of different sizes and occlusions. This network outputs a large variety of pedestrian candidates, covering the majority of ground-truth pedestrians while also introducing a large number of false positives. Next, multiple deep neural networks are used in parallel to further refine these pedestrian candidates. We introduce a soft-rejection based network fusion method that fuses the soft metrics from all networks to generate the final confidence scores. Our method performs better than existing state-of-the-art methods, especially when detecting small and occluded pedestrians. Furthermore, we propose a method for integrating a pixel-wise semantic segmentation network into the network fusion architecture as a reinforcement to the pedestrian detector. The approach outperforms state-of-the-art methods on most protocols on the Caltech Pedestrian dataset, with significant boosts on several protocols. It is also faster than all other methods.
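
The text above does not give the exact fusion rule, so the following Python sketch shows only one plausible form of soft rejection. Instead of discarding a candidate when a refinement network disagrees, each network's probability is turned into a multiplicative scaling factor with a lower bound, so no single network can zero out a candidate. The function name `soft_rejection_fuse` and the parameters `accept_threshold` and `floor`, along with their default values, are illustrative assumptions and are not taken from the text.

```python
def soft_rejection_fuse(candidate_score, classifier_probs,
                        accept_threshold=0.7, floor=0.1):
    """Fuse a candidate's detector score with refinement-network probabilities.

    Hypothetical sketch: accept_threshold and floor are assumed values.
    A probability at or above accept_threshold keeps or boosts the score,
    a lower probability shrinks it, and the scale never drops below floor.
    """
    fused = candidate_score
    for prob in classifier_probs:
        # Soft rejection: scale the score instead of discarding the candidate.
        scale = max(prob / accept_threshold, floor)
        fused *= scale
    return fused


if __name__ == "__main__":
    # One candidate from the single-shot detector, refined by two networks.
    print(soft_rejection_fuse(0.5, [0.9, 0.8]))   # both networks agree
    print(soft_rejection_fuse(0.5, [0.9, 0.05]))  # one network disagrees
```

Because the refinement networks contribute independent factors, their probabilities can be computed in parallel and combined in any order, which matches the parallel-processing design described above.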