Detecting Humans in RGB-D Data with CNNs
This work addresses people detection for robotics and surveillance applications, but the contribution appears incremental: it builds on existing CNN detection methods and adds depth-based enhancements.
The paper tackles human detection in RGB-D data by developing a region-of-interest selection method and a novel fusion approach for color and depth CNNs, along with a new depth-encoding scheme, and shows it outperforms RGB-only baselines on a public dataset.
We address the problem of people detection in RGB-D data. We leverage depth information to develop a region-of-interest (ROI) selection method that provides proposals to two CNNs, one operating on color images and one on depth images. To combine the detections produced by the two CNNs, we propose a novel fusion approach based on the characteristics of depth images. We also present a new depth-encoding scheme, which not only encodes depth images into three channels but also enhances the information for classification. We conduct experiments on a publicly available RGB-D people dataset and show that our approach outperforms baseline models that use only RGB data.
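The abstract does not specify the depth encoding or the fusion rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a three-channel encoding made of normalized depth plus horizontal and vertical gradient magnitudes, and a late-fusion rule that down-weights the depth CNN score when few pixels in the ROI carry valid depth. The function names `encode_depth_3ch` and `fuse_scores`, and the parameters `max_range_m` and `min_valid`, are hypothetical.

```python
import numpy as np


def encode_depth_3ch(depth_m, max_range_m=8.0):
    """Assumed encoding: (normalized depth, |d/dx|, |d/dy|) as uint8 channels.

    depth_m is a 2-D array of depth in meters; zeros or NaNs mark missing depth.
    """
    depth = np.asarray(depth_m, dtype=np.float32)
    valid = np.isfinite(depth) & (depth > 0)
    # Scale valid depth to [0, 1]; missing pixels become 0.
    d = np.where(valid, np.clip(depth, 0.0, max_range_m) / max_range_m, 0.0)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    gy, gx = np.gradient(d)

    def to_u8(a):
        a = np.abs(a)
        peak = a.max()
        if peak == 0:
            return np.zeros(a.shape, dtype=np.uint8)
        return (255 * a / peak).astype(np.uint8)

    return np.stack([(255 * d).astype(np.uint8), to_u8(gx), to_u8(gy)], axis=-1)


def fuse_scores(rgb_score, depth_score, depth_roi, min_valid=0.5):
    """Assumed late fusion: trust the depth score in proportion to valid depth.

    The weight is the fraction of ROI pixels with valid depth, or 0 when that
    fraction falls below min_valid, in which case the RGB score is returned.
    """
    roi = np.asarray(depth_roi, dtype=np.float32)
    valid_frac = float(np.mean(np.isfinite(roi) & (roi > 0)))
    w = valid_frac if valid_frac >= min_valid else 0.0
    return (rgb_score + w * depth_score) / (1.0 + w)
```

In this sketch, an ROI with no valid depth falls back to the color detector alone, which reflects one known characteristic of depth images: sensors often return missing values at long range or on reflective surfaces.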