Joint Hand Detection and Rotation Estimation by Using CNN
This work addresses hand detection for robotics and human-computer interaction, but its contribution is incremental: it builds on existing detection models and adds targeted improvements.
The paper tackles hand detection and in-plane rotation estimation in uncontrolled environments. It proposes a deep learning approach that combines a context-aware proposal generation algorithm with a CNN that jointly handles detection and rotation. The method achieves better results than state-of-the-art models on benchmarks such as the Oxford and Egohands databases, and the results indicate that rotation estimation and classification benefit each other.
Hand detection is essential for many hand-related tasks, such as parsing hand pose and understanding gestures, which are extremely useful for robotics and human-computer interaction. However, hand detection in uncontrolled environments is challenging due to the flexibility of the wrist joint and cluttered backgrounds. We propose a deep-learning-based approach that detects hands and, under supervision, calibrates their in-plane rotation at the same time. To guarantee recall, we propose a context-aware proposal generation algorithm that significantly outperforms selective search. We then design a convolutional neural network (CNN) that handles object rotation explicitly to jointly solve the object detection and rotation estimation tasks. Experiments show that our method achieves better results than state-of-the-art detection models on widely used benchmarks such as the Oxford and Egohands databases. We further show that rotation estimation and classification can benefit each other.
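
The abstract does not state how the detection and rotation objectives are combined. As a rough illustration only, joint training of this kind is often expressed as a weighted sum of a classification term and a rotation term. The sketch below is a generic multi-task loss for a single proposal, not the authors' formulation: the function name `joint_loss`, the `rot_weight` balancing factor, and the choice of a `1 - cos` angular penalty are all assumptions made for this example.

```python
import math


def joint_loss(hand_prob, is_hand, pred_angle, true_angle, rot_weight=1.0):
    """Hypothetical joint loss for one proposal (illustrative, not from the paper).

    hand_prob:   predicted probability that the proposal contains a hand
    is_hand:     ground-truth label (True for hand, False for background)
    pred_angle:  predicted in-plane rotation, in radians
    true_angle:  ground-truth in-plane rotation, in radians
    rot_weight:  assumed factor balancing the two tasks
    """
    eps = 1e-7
    p = min(max(hand_prob, eps), 1.0 - eps)  # clamp to avoid log(0)

    # Binary cross-entropy for hand / background classification.
    cls_term = -math.log(p) if is_hand else -math.log(1.0 - p)

    # Rotation is supervised only on positive proposals. The 1 - cos
    # penalty is periodic, so angles that differ by 2*pi cost nothing.
    rot_term = (1.0 - math.cos(pred_angle - true_angle)) if is_hand else 0.0

    return cls_term + rot_weight * rot_term


if __name__ == "__main__":
    # A confident, correctly rotated positive proposal has a small loss.
    print(joint_loss(0.9, True, 0.0, 0.0))
    # The same proposal with a 90-degree rotation error is penalised more.
    print(joint_loss(0.9, True, math.pi / 2, 0.0))
```

Because the rotation term is masked out for background proposals, only hand regions contribute rotation gradients, while every proposal contributes to classification. This is one simple way the two tasks could share features and influence each other during training.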