CV · Jun 9, 2017

Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection

arXiv:1706.03038v2 · 200 citations
Originality: Synthesis-oriented
AI Analysis

This dataset gives researchers a benchmark for aerial-view human action detection that is representative of real-world conditions. Its contribution is incremental in that it focuses on data creation rather than algorithmic advancement.

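The authors address the lack of a dataset representative of real-world aerial-view human action detection by introducing Okutama-Action, a video dataset of 43 minute-long, fully annotated sequences with 12 action classes. It is more challenging than existing datasets because of dynamic transitions between actions, significant changes in scale and aspect ratio, abrupt camera movement, and multi-labeled actors.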

Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt camera movement, as well as multi-labeled actors. As a result, our dataset is more challenging than existing ones, and will help push the field forward to enable real-world applications.

