Hand Action Detection from Ego-centric Depth Sequences with Error-correcting Hough Transform
This work addresses a challenging computer vision problem with applications such as human-computer interaction, but its contribution is incremental: it builds on existing Hough transform methods and adds a specific error-correcting component.
The paper tackles hand action detection from ego-centric depth sequences by introducing a Hough transform approach with an error-correcting component to address incorrect votes, achieving satisfactory results on a new dataset of 300 videos with 3,177 subsequences across 16 action classes.
Detecting hand actions from ego-centric depth sequences is a challenging problem in practice, owing mostly to the complex and dexterous nature of hand articulations as well as non-stationary camera motion. We address this problem with a Hough-transform-based approach coupled with a discriminatively learned error-correcting component, which tackles the well-known issue of incorrect votes in the Hough transform. In this framework, local parts vote collectively for the start and end positions of each action over time. We also construct an in-house annotated dataset of 300 long videos, containing 3,177 single-action subsequences over 16 action classes collected from 26 individuals. Our system is empirically evaluated on this real-life dataset for both the action recognition and detection tasks, and is shown to produce satisfactory results. To facilitate reproduction, the new dataset and our implementation are also provided online.
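The temporal voting idea described above, in which local parts vote for the start and end positions of an action, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `hough_vote_boundaries`, the per-part offsets, and the toy values are hypothetical, and the discriminatively learned error-correcting component is not modeled. The sketch only shows how one incorrect vote is outweighed when the remaining parts agree.

```python
def hough_vote_boundaries(part_times, start_offsets, end_offsets,
                          num_frames, weights=None):
    """Accumulate 1-D Hough votes for the start and end frame of an action.

    Hypothetical inputs, one entry per local part:
      part_times    -- frame index at which the part is observed
      start_offsets -- assumed number of frames since the action started
      end_offsets   -- assumed number of frames until the action ends
      weights       -- optional vote weights (defaults to 1.0 per part)
    """
    if weights is None:
        weights = [1.0] * len(part_times)
    start_acc = [0.0] * num_frames
    end_acc = [0.0] * num_frames
    for t, ds, de, w in zip(part_times, start_offsets, end_offsets, weights):
        start_vote = t - ds
        end_vote = t + de
        # Votes that fall outside the sequence are discarded.
        if 0 <= start_vote < num_frames:
            start_acc[start_vote] += w
        if 0 <= end_vote < num_frames:
            end_acc[end_vote] += w
    return start_acc, end_acc


def argmax(values):
    """Index of the largest accumulator value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Toy data: four parts agree on an action spanning frames 10 to 30;
# the fifth part casts an incorrect vote (start 37, end 70).
part_times = [12, 15, 20, 25, 40]
start_offsets = [2, 5, 10, 15, 3]
end_offsets = [18, 15, 10, 5, 30]

start_acc, end_acc = hough_vote_boundaries(
    part_times, start_offsets, end_offsets, num_frames=100)
start_frame = argmax(start_acc)
end_frame = argmax(end_acc)
print(start_frame, end_frame)  # prints "10 30"
```

In this toy case the outlier is outvoted by simple majority. With fewer or noisier parts, incorrect votes can dominate the accumulator, which is the failure mode the error-correcting component is meant to address.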