CV LG IV

Aug 9, 2021

AutoVideo: An Automated Video Action Recognition System

Daochen Zha, Zaid Pervaiz Bhat, Yi-Wei Chen, Yicheng Wang, Sirui Ding, Jiaben Chen, Kwei-Herng Lai, Mohammad Qazim Bhat, Anmoll Kumar Jain, Alfredo Costilla Reyes, Na Zou, Xia Hu

arXiv:2108.04212v47.311 citations

Has Code

Originality: Synthesis-oriented

AI Analysis

AutoVideo targets researchers and practitioners who need to build and test action recognition solutions without extensive manual tuning. The contribution is incremental: it automates existing methods rather than introducing new paradigms.

The paper aims to reduce the engineering effort involved in video action recognition by introducing AutoVideo, an automated system that provides a modular infrastructure, an exhaustive set of primitives, data-driven tuners, and a GUI. The tool is publicly released under the MIT license.

Action recognition is an important task for video understanding with broad applications. However, developing an effective action recognition solution often requires extensive engineering effort to build and test different combinations of modules and their hyperparameters. In this demo, we present AutoVideo, a Python system for automated video action recognition. AutoVideo features 1) a highly modular and extendable infrastructure following the standard pipeline language, 2) an exhaustive list of primitives for pipeline construction, 3) data-driven tuners that reduce the effort of pipeline tuning, and 4) an easy-to-use Graphical User Interface (GUI). AutoVideo is released under the MIT license at https://github.com/datamllab/autovideo
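To make the idea of tuning a pipeline of modules and hyperparameters concrete, the following is a minimal sketch of random search over a small search space. It does not use AutoVideo's API: the stage names, the option values, and the `toy_evaluate` scoring function are all hypothetical stand-ins, and a real tuner would train and validate a model where `toy_evaluate` is called.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical search space: each pipeline stage or hyperparameter
# maps to the options a tuner may pick from. Names are illustrative only.
SEARCH_SPACE: Dict[str, List[object]] = {
    "frame_sampler": ["uniform", "dense"],
    "augmentation": ["none", "random_crop", "horizontal_flip"],
    "recognizer": ["model_a", "model_b", "model_c"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}

Config = Dict[str, object]


def sample_pipeline(rng: random.Random) -> Config:
    """Pick one option per stage to form a candidate pipeline."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def random_search(
    evaluate: Callable[[Config], float], n_trials: int, seed: int = 0
) -> Tuple[Config, float]:
    """Evaluate n_trials random pipelines and return the best one found."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    rng = random.Random(seed)
    best_config: Config = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        config = sample_pipeline(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_evaluate(config: Config) -> float:
    """Synthetic stand-in for train-and-validate (not a real accuracy).

    It counts how many choices match an arbitrary target configuration.
    """
    target = {
        "frame_sampler": "uniform",
        "augmentation": "random_crop",
        "recognizer": "model_b",
        "learning_rate": 1e-3,
    }
    return float(sum(config[key] == value for key, value in target.items()))


if __name__ == "__main__":
    best, score = random_search(toy_evaluate, n_trials=50)
    print(best, score)
```

Random search is used here only because it is the simplest tuner to write down; any strategy that proposes a configuration and receives a score back fits the same loop.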

