Two-stream convolutional neural network for accurate RGB-D fingertip detection using depth and edge information
This work addresses fingertip detection for human-computer interaction, presenting an incremental improvement over existing methods.
The paper tackles accurate fingertip detection in depth images for human-computer interaction by proposing a two-stream CNN that combines depth and edge information, achieving a state-of-the-art average 3D error of 9.9 mm on the HandNet dataset and comparable accuracy on the NYU hand dataset.
Accurate detection of fingertips in depth images is critical for human-computer interaction. In this paper, we present a novel two-stream convolutional neural network (CNN) for RGB-D fingertip detection. First, an edge image is extracted from the raw depth image using a random forest. The edge information is then combined with the depth information in our CNN structure. We study several fusion approaches and suggest a slow fusion strategy as a promising approach to fingertip detection. As shown in our experiments, our real-time algorithm outperforms state-of-the-art fingertip detection methods on the public HandNet dataset, with an average 3D error of 9.9 mm, and shows comparable fingertip estimation accuracy on the NYU hand dataset.
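The abstract reports an average 3D error without defining it. A common reading of this metric is the mean Euclidean distance between predicted and ground-truth fingertip positions, and the minimal sketch below assumes that definition. The function name `mean_3d_error` and the sample coordinates are hypothetical illustrations, not the authors' evaluation code or data.

```python
import math


def mean_3d_error(predicted, ground_truth):
    """Mean Euclidean distance between matched 3D points.

    Assumed definition of "average 3D error"; the result is in the
    same unit as the inputs (millimetres in the example below).
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        raise ValueError("at least one point is required")
    # math.dist computes the Euclidean distance between two points.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical fingertip positions (x, y, z) in millimetres; not from the paper.
predicted = [(10.0, 20.0, 300.0), (50.0, 60.0, 310.0)]
ground_truth = [(13.0, 24.0, 300.0), (50.0, 60.0, 316.0)]

error = mean_3d_error(predicted, ground_truth)
print(f"average 3D error: {error:.1f} mm")
```

Under this assumed definition, a reported figure such as 9.9 mm would be this average taken over all annotated fingertips in the test set.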