CV, LG · Jun 1, 2020

DPDnet: A Robust People Detector using Deep Learning with an Overhead Depth Camera

arXiv:2006.01053v1 · 2 citations
Originality: Incremental advance
AI Analysis

DPDnet provides a robust solution for people detection in applications such as surveillance and crowd monitoring, though it appears incremental because it builds on existing encoder-decoder and residual-layer architectures.

The paper tackles the problem of detecting multiple people from a single overhead depth image by proposing DPDnet, a deep learning method that achieves over 99% accuracy on three public datasets and runs in real time on conventional GPUs.

In this paper we propose a deep learning method that detects multiple people from a single overhead depth image with high reliability. Our neural network, called DPDnet, consists of two fully convolutional encoder-decoder blocks built from residual layers. The main block takes a depth image as input and generates a pixel-wise confidence map, in which each detected person is represented by a Gaussian-like distribution. The refinement block combines the depth image with the output of the main block to refine the confidence map. Both blocks are trained simultaneously, end-to-end, using depth images and head position labels. The experimental work shows that DPDnet outperforms state-of-the-art methods, with accuracies greater than 99% on three different publicly available datasets, without retraining or fine-tuning. In addition, the computational complexity of our proposal is independent of the number of people in the scene, and it runs in real time on conventional GPUs.
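The confidence-map representation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names `gaussian_confidence_map` and `detect_peaks`, the spread `sigma`, the `threshold`, and the window `radius` are all assumptions introduced here for illustration. The first function builds a training target with one Gaussian-like blob per head position label. The second recovers head positions from a map by simple local-maximum search, which is one plausible post-processing step and not necessarily the one used in the paper.

```python
import numpy as np


def gaussian_confidence_map(shape, heads, sigma=8.0):
    """Build a pixel-wise map with one Gaussian-like blob per head label.

    shape: (height, width) of the depth image.
    heads: iterable of (row, col) head positions.
    sigma: assumed blob spread in pixels (hypothetical value).
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    conf = np.zeros(shape, dtype=np.float64)
    for r, c in heads:
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Taking the maximum keeps every peak at 1.0 even where blobs overlap.
        conf = np.maximum(conf, blob)
    return conf


def detect_peaks(conf, threshold=0.5, radius=5):
    """Return (row, col) of local maxima above `threshold`.

    A pixel counts as a peak if no value inside its
    (2 * radius + 1)-wide square window exceeds it.
    Both parameters are assumed values, not taken from the paper.
    """
    height, width = conf.shape
    padded = np.pad(conf, radius, mode="constant", constant_values=-np.inf)
    peaks = []
    for r in range(height):
        for c in range(width):
            value = conf[r, c]
            if value < threshold:
                continue
            window = padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1]
            if value >= window.max():
                peaks.append((r, c))
    return peaks


# Two labelled heads in a 100 x 120 image.
target = gaussian_confidence_map((100, 120), [(20, 30), (60, 90)])
print(detect_peaks(target))  # [(20, 30), (60, 90)]
```

In this sketch the loop over head labels is needed only to build training targets. At inference time the network would produce the whole map in a single forward pass, which fits the abstract's statement that the cost does not depend on the number of people in the scene.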

Code Implementations: 2 repos
