Simultaneous prediction of hand gestures, handedness, and hand keypoints using thermal images
This work addresses hand-based human-computer interaction in thermal imaging contexts, but its contribution is incremental, since it applies established multi-task learning to a new modality.
The paper tackles simultaneous hand gesture classification, handedness detection, and hand keypoint localization using thermal images, achieving over 98% accuracy for gestures, handedness, and fingertips, and over 91% for wrist points.
Hand gesture detection is a well-explored area of computer vision, with applications in many forms of human-computer interaction. In this work, we propose a technique for simultaneous hand gesture classification, handedness detection, and hand keypoint localization using thermal data captured by an infrared camera. Our method uses a novel deep multi-task learning architecture consisting of shared encoder-decoder layers followed by three branches, one dedicated to each task. We performed extensive experimental validation of our model on an in-house dataset comprising data from 24 users. The results confirm accuracy above 98 percent for gesture classification, handedness detection, and fingertip localization, and above 91 percent for wrist point localization.
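To make the shared-trunk, three-branch layout concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It replaces the convolutional encoder-decoder with two small dense layers purely for illustration. The class name `MultiTaskHandNet`, the layer sizes, the number of gesture classes (10), the number of keypoints (7), and the encoding of handedness as a right-hand probability are all assumptions that do not come from the abstract.

```python
import math
import random


def dense(in_dim, out_dim, rng):
    """Create one linear layer as (weights, bias) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def apply(layer, x):
    """Compute W @ x + b for a single input vector x."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


class MultiTaskHandNet:
    """Hypothetical stand-in: shared layers feeding three task-specific heads."""

    def __init__(self, in_dim, hidden=16, n_gestures=10, n_keypoints=7, seed=0):
        rng = random.Random(seed)
        # Shared layers (dense stand-ins for the shared encoder-decoder).
        self.encoder = dense(in_dim, hidden, rng)
        self.decoder = dense(hidden, hidden, rng)
        # One branch per task; all sizes are illustrative assumptions.
        self.gesture_head = dense(hidden, n_gestures, rng)
        self.handedness_head = dense(hidden, 1, rng)
        self.keypoint_head = dense(hidden, 2 * n_keypoints, rng)

    def forward(self, x):
        # Every branch consumes the same shared representation.
        shared = relu(apply(self.decoder, relu(apply(self.encoder, x))))
        gesture_probs = softmax(apply(self.gesture_head, shared))
        right_hand_prob = sigmoid(apply(self.handedness_head, shared)[0])
        flat = apply(self.keypoint_head, shared)
        keypoints = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
        return gesture_probs, right_hand_prob, keypoints


# Usage: a constant vector stands in for a flattened thermal frame.
model = MultiTaskHandNet(in_dim=64)
frame = [0.5] * 64
gesture_probs, right_hand_prob, keypoints = model.forward(frame)
```

Because all three heads read the same shared features, a single forward pass yields the gesture distribution, the handedness estimate, and the keypoint coordinates together, which is the sense in which the predictions are simultaneous.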