CV | Sep 29, 2023

Telling Stories for Common Sense Zero-Shot Action Recognition

arXiv:2309.17327v2 | 5 citations | h-index: 17 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the reliance on large labeled datasets in video understanding by offering a resource, the Stories dataset, that eases data bottlenecks in zero-shot action recognition. The advance is incremental, since it builds on existing progress in language modeling.

The authors tackle zero-shot action recognition by introducing Stories, a new dataset of rich textual descriptions for action classes, together with a method that uses this dataset to improve feature generation. Without fine-tuning on any target dataset, the method achieves state-of-the-art results on multiple benchmarks, improving top-1 accuracy by up to 6.1%.

Video understanding has long suffered from reliance on large labeled datasets, motivating research into zero-shot learning. Recent progress in language modeling presents opportunities to advance zero-shot video analysis, but constructing an effective semantic space relating action classes remains challenging. We address this by introducing a novel dataset, Stories, which contains rich textual descriptions for diverse action classes extracted from WikiHow articles. For each class, we extract multi-sentence narratives detailing the necessary steps, scenes, objects, and verbs that characterize the action. This contextual data enables modeling of nuanced relationships between actions, paving the way for zero-shot transfer. We also propose an approach that harnesses Stories to improve feature generation for training zero-shot classification. Without any target dataset fine-tuning, our method achieves a new state of the art on multiple benchmarks, improving top-1 accuracy by up to 6.1%. We believe Stories provides a valuable resource that can catalyze progress in zero-shot action recognition. The textual narratives forge connections between seen and unseen classes, overcoming the bottleneck of labeled data that has long impeded advancements in this exciting domain. The data can be found here: https://github.com/kini5gowda/Stories.
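
As a rough illustration of how class-level narratives can relate seen and unseen actions, the toy Python sketch below compares story texts with a bag-of-words cosine similarity and finds the seen class closest to an unseen one. It is not the authors' method: the story texts are hand-written placeholders rather than entries from the Stories dataset, bag-of-words counts stand in for a learned language-model embedding, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-written stand-ins for story texts.
# These are NOT taken from the Stories dataset.
STORIES = {
    "making tea": "Boil water in a kettle. Put a tea bag in a cup. "
                  "Pour the hot water into the cup.",
    "making coffee": "Boil water in a kettle. Put ground coffee in a filter. "
                     "Pour the hot water over the coffee into a cup.",
    "playing guitar": "Hold the guitar on your lap. Press the strings with "
                      "your fingers. Strum the strings with a pick.",
}


def bag_of_words(text):
    """Count lowercase word occurrences (a crude stand-in for a text embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_seen_class(unseen, seen, stories):
    """Return the seen class whose story is most similar to the unseen class's story."""
    query = bag_of_words(stories[unseen])
    return max(seen, key=lambda c: cosine(query, bag_of_words(stories[c])))


if __name__ == "__main__":
    seen_classes = ["making tea", "playing guitar"]
    print(nearest_seen_class("making coffee", seen_classes, STORIES))
```

Here the unseen class "making coffee" lands next to the seen class "making tea" because their narratives share steps and objects (boiling water, a kettle, a cup). A realistic pipeline would replace the word counts with embeddings from a language model.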

Code Implementations: 1 repo