UC Merced Submission to the ActivityNet Challenge 2016
This work addresses action classification in untrimmed videos and is aimed at computer vision researchers. It is incremental: it combines existing methods without introducing new techniques.
The authors tackle action recognition in long, untrimmed videos for the ActivityNet Challenge 2016 by investigating multiple state-of-the-art approaches. The resulting system trains SVM classifiers on hand-crafted and deep-network features and fuses their predictions with the softmax scores of ResNet-101.
This notebook paper describes our system for the untrimmed classification task in the ActivityNet Challenge 2016. We investigate multiple state-of-the-art approaches for action recognition in long, untrimmed videos. We exploit hand-crafted motion boundary histogram features as well as feature activations from deep networks such as VGG16, GoogLeNet, and C3D. These features are separately fed to linear, one-versus-rest support vector machine classifiers to produce confidence scores for each action class. These predictions are then fused along with the softmax scores of the recent ultra-deep ResNet-101 using weighted averaging.
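The final fusion step can be illustrated with a minimal sketch of weighted averaging over per-class scores. This is not the submitted implementation: the function names `fuse_scores` and `predict_class`, the model keys, and all numeric values are hypothetical, and normalizing the weights so they sum to one is an assumption, since the text above does not state how the fusion weights were chosen or scaled.

```python
from typing import Dict, List, Sequence


def fuse_scores(
    scores_by_model: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the weighted average of per-class scores across models.

    Each entry of scores_by_model holds one score per action class, for
    example SVM confidence scores or softmax probabilities for one video.
    Weights are normalized to sum to 1 (an assumption of this sketch).
    """
    if not scores_by_model:
        raise ValueError("at least one model is required")

    num_classes = len(next(iter(scores_by_model.values())))
    total_weight = sum(weights[name] for name in scores_by_model)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = [0.0] * num_classes
    for name, scores in scores_by_model.items():
        if len(scores) != num_classes:
            raise ValueError(f"model {name!r} has a different class count")
        w = weights[name] / total_weight
        for class_idx, score in enumerate(scores):
            fused[class_idx] += w * score
    return fused


def predict_class(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical per-class scores for one video and three action classes.
example_scores = {
    "mbh_svm": [0.2, 0.8, 0.0],
    "resnet101_softmax": [0.6, 0.3, 0.1],
}
# Hypothetical fusion weights; in practice they would be tuned on validation data.
example_weights = {"mbh_svm": 1.0, "resnet101_softmax": 3.0}

example_fused = fuse_scores(example_scores, example_weights)
example_prediction = predict_class(example_fused)
```

Note that SVM confidence scores and softmax probabilities live on different scales, so a practical system would likely calibrate or rescale them before averaging; the sketch leaves the inputs as given.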