CV LG PF · Aug 20, 2020

Accuracy and Performance Comparison of Video Action Recognition Approaches

arXiv:2008.09037v1 · 5 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a standardized benchmark for researchers and practitioners in video action recognition. Its contribution is incremental, since it compares existing approaches rather than proposing new methods.

The paper addresses the problem of inconsistent comparisons in video action recognition by evaluating fourteen models under uniform conditions, reporting Top-1 and Top-5 accuracy along with computational performance on up to 64 GPUs.

Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparisons of accuracy and computational performance results remain clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics, in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using the standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare the computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.
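For readers unfamiliar with the Top-1 and Top-5 metrics mentioned in the abstract, the following is a minimal sketch of how top-k accuracy is commonly computed. It is not taken from the paper: the function name `top_k_accuracy`, the toy scores, and the tie-breaking rule are all assumptions made for illustration. The paper's proposed new accuracy metric is not described in this summary, so it is not sketched here.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    `scores[i][c]` is the model's score for class `c` on sample `i`
    (hypothetical layout, chosen for this illustration).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; ties keep the lower index first.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example: four video clips, three action classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 2]

print(top_k_accuracy(scores, labels, k=1))  # 0.25
print(top_k_accuracy(scores, labels, k=2))  # 0.5
```

With only three classes in the toy data, `k=2` stands in for Top-5; on a real action recognition dataset with hundreds of classes, the same function would be called with `k=1` and `k=5`.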
