Deep video gesture recognition using illumination invariants
This work addresses gesture recognition in videos, which matters for applications such as human-computer interaction. It appears incremental, however, since it builds on existing deep learning methods and adds specific improvements.
The paper tackles gesture recognition in videos by developing deep neural network architectures that are invariant to local scaling and lighting conditions, achieving superior results over existing methods including recent neural network approaches.
In this paper we present architectures based on deep neural nets for gesture recognition in videos, which are invariant to local scaling. We amalgamate autoencoder and predictor architectures using an adaptive weighting scheme that copes with a reduced-size labeled dataset, while enriching our models from enormous unlabeled sets. We further improve robustness to lighting conditions by introducing a new adaptive filter based on temporal local scale normalization. We provide superior results over known methods, including recently reported approaches based on neural nets.
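To make the idea of temporal local scale normalization concrete, the following is a minimal sketch and not the filter used in this work, whose exact form is not given here. It assumes a grayscale clip stored as a NumPy array of shape (T, H, W) and normalizes each pixel by the mean and standard deviation of its own values inside a sliding temporal window. The function name `temporal_local_scale_normalize` and the parameters `window` and `eps` are hypothetical choices made for illustration.

```python
import numpy as np


def temporal_local_scale_normalize(video, window=5, eps=1e-6):
    """Illustrative sketch only (assumed form, not the paper's filter).

    video  : array of shape (T, H, W), one grayscale frame per time step.
    window : assumed length of the sliding temporal window, in frames.
    eps    : small constant that avoids division by zero on static pixels.
    """
    video = np.asarray(video, dtype=np.float64)
    num_frames = video.shape[0]
    half = window // 2
    out = np.empty_like(video)
    for t in range(num_frames):
        # Clip the window at the start and end of the sequence.
        lo, hi = max(0, t - half), min(num_frames, t + half + 1)
        local = video[lo:hi]
        # Per-pixel statistics over time, within the local window only.
        mean = local.mean(axis=0)
        scale = local.std(axis=0)
        out[t] = (video[t] - mean) / (scale + eps)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((8, 4, 4))
    normalized = temporal_local_scale_normalize(clip)
    print(normalized.shape)
```

Under these assumptions, a global change in gain and offset (replacing every frame `v` by `a * v + b` with `a > 0`) leaves the output nearly unchanged, which is the kind of robustness to lighting that the abstract refers to.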