CV LG PF | Aug 20, 2020

Accuracy and Performance Comparison of Video Action Recognition Approaches

Matthew Hutchinson, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Micheal Houle, Matthew Hubbell, Micheal Jones, Jeremy Kepner, Andrew Kirby, Peter Michaleas

arXiv:2008.09037v1 | 5 citations

Originality: Synthesis-oriented

AI Analysis

This paper provides a standardized benchmark for researchers and practitioners in video action recognition, though its contribution is incremental: it compares existing methods rather than proposing new ones.

The paper tackles the problem of inconsistent comparisons in video action recognition by directly evaluating fourteen models under uniform training conditions, reporting Top-1 and Top-5 accuracy along with computational performance on up to 64 GPUs.

Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remains clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.
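
For readers unfamiliar with the Top-1 and Top-5 metrics named in the abstract, the following is a minimal sketch of how top-k accuracy is commonly computed. It is not the authors' evaluation code. The function name `top_k_accuracy`, the toy scores, and the tie-breaking rule are assumptions made for illustration. The paper's proposed new accuracy metric is not defined in the abstract and is not shown here.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 1,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    `scores[i][c]` is the model's score for class `c` on sample `i`;
    `labels[i]` is the index of the true class for sample `i`.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices by descending score. Python's sort is stable,
        # so tied scores keep the lower class index first (an assumed rule).
        ranked = sorted(range(len(row)), key=lambda c: -row[c])
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Hypothetical scores for 3 video clips over 3 action classes.
    toy_scores = [
        [0.1, 0.7, 0.2],
        [0.5, 0.3, 0.2],
        [0.2, 0.3, 0.5],
    ]
    toy_labels = [1, 1, 0]
    print("Top-1:", top_k_accuracy(toy_scores, toy_labels, k=1))
    print("Top-2:", top_k_accuracy(toy_scores, toy_labels, k=2))
```

Top-5 accuracy is the same computation with `k=5`, which is only informative when there are many more than five classes.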
