Multi-Action Recognition via Stochastic Modelling of Optical Flow and Gradients
This work addresses action recognition in videos and offers a simpler alternative to more complex methods such as HMM-based approaches, but the contribution is incremental, as it builds on existing datasets and benchmarks.
The paper tackles multi-action recognition by proposing a joint segmentation and classification method using Gaussian mixtures on low-dimensional features, achieving 78.3% accuracy on a stitched KTH dataset, outperforming an HMM-based approach at 71.2%.
In this paper we propose a novel approach to multi-action recognition that performs joint segmentation and classification. The approach models each action with a Gaussian mixture over robust low-dimensional action features. Segmentation is achieved by classifying overlapping temporal windows, which are then merged to produce the final result. This is considerably simpler than previous methods that rely on dynamic programming or computationally expensive hidden Markov models (HMMs). Initial experiments on a stitched version of the KTH dataset show that the proposed approach achieves an accuracy of 78.3%, outperforming a recent HMM-based approach, which obtained 71.2%.
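The window-based segmentation described above can be illustrated with a minimal sketch. It is not the paper's implementation, and several parts are assumptions made for illustration: the per-action mixtures are given as fixed diagonal-covariance parameters rather than trained, the window length and step are arbitrary, overlapping windows are merged by a per-frame majority vote (the abstract does not specify the merging rule), and the function names and 1-D toy features are hypothetical.

```python
import math
from collections import Counter


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples (assumed layout).
    """
    comp_logs = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        comp_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(comp_logs)
    return m + math.log(sum(math.exp(c - m) for c in comp_logs))


def classify_window(frames, models):
    """Return the action whose GMM gives the highest summed log-likelihood."""
    scores = {
        label: sum(gmm_log_likelihood(x, gmm) for x in frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get)


def segment_and_classify(features, models, window=6, step=2):
    """Classify overlapping windows, then merge them by per-frame majority vote.

    Returns a list of (start, end_exclusive, label) segments.
    The voting rule is an assumption, not the paper's stated merging scheme.
    """
    n = len(features)
    if n == 0:
        return []
    starts = list(range(0, max(n - window, 0) + 1, step))
    if starts[-1] + window < n:
        starts.append(n - window)  # extra window so the tail frames are covered
    votes = [Counter() for _ in range(n)]
    for start in starts:
        end = min(start + window, n)
        label = classify_window(features[start:end], models)
        for t in range(start, end):
            votes[t][label] += 1
    frame_labels = [v.most_common(1)[0][0] for v in votes]
    # Collapse runs of identical frame labels into segments.
    segments = []
    seg_start = 0
    for t in range(1, n + 1):
        if t == n or frame_labels[t] != frame_labels[seg_start]:
            segments.append((seg_start, t, frame_labels[seg_start]))
            seg_start = t
    return segments


# Toy 1-D example: two hypothetical actions with two-component mixtures.
models = {
    "walk": [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])],
    "run": [(0.5, [5.0], [1.0]), (0.5, [6.0], [1.0])],
}
features = [[0.2], [0.8], [0.5], [0.1], [0.9], [0.4],
            [5.5], [5.1], [5.9], [5.3], [5.7], [5.2]]
segments = segment_and_classify(features, models)
```

Because every window is scored independently, no dynamic programming or HMM decoding is needed; the cost of the sketch grows linearly with the number of windows, actions, and mixture components.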