CV, CL, CY, IR | Sep 12, 2023

Human Action Co-occurrence in Lifestyle Vlogs using Graph Link Prediction

arXiv:2309.06219v3 | h-index: 50 | Has Code
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need to understand how human actions interact in lifestyle vlogs, but its contribution is incremental: it applies existing graph methods to a new dataset.

The paper tackles the problem of automatically identifying whether two human actions can co-occur in the same time interval by introducing the ACE dataset with ~12k co-occurring pairs and graph link prediction models, showing that graph representations effectively capture action relations across domains.

We introduce the task of automatic human action co-occurrence identification, i.e., determine whether two human actions can co-occur in the same interval of time. We create and make publicly available the ACE (Action Co-occurrencE) dataset, consisting of a large graph of ~12k co-occurring pairs of visual actions and their corresponding video clips. We describe graph link prediction models that leverage visual and textual information to automatically infer if two actions are co-occurring. We show that graphs are particularly well suited to capture relations between human actions, and the learned graph representations are effective for our task and capture novel and relevant information across different data domains. The ACE dataset and the code introduced in this paper are publicly available at https://github.com/MichiganNLP/vlog_action_co-occurrence.
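To make the idea of graph link prediction concrete, here is a minimal sketch that treats actions as nodes and known co-occurrences as edges, then scores unlinked pairs by neighbourhood overlap (Jaccard coefficient). This is a generic heuristic chosen for illustration, not the paper's models, which also use visual and textual information. The action names and edges are hypothetical toy data and do not come from the ACE dataset.

```python
from itertools import combinations

# Hypothetical toy co-occurrence edges (NOT taken from the ACE dataset).
EDGES = [
    ("chop onion", "talk to camera"),
    ("chop onion", "stir pot"),
    ("stir pot", "talk to camera"),
    ("stir pot", "add salt"),
    ("add salt", "talk to camera"),
    ("fold laundry", "listen to music"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: action -> set of co-occurring actions."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard_score(adj, u, v):
    """Neighbourhood overlap of u and v: |N(u) & N(v)| / |N(u) | N(v)|."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def rank_missing_links(adj):
    """Score every pair of actions with no edge yet, highest score first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    scored = [(jaccard_score(adj, u, v), u, v) for u, v in candidates]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


adj = build_adjacency(EDGES)
ranking = rank_missing_links(adj)
```

On this toy graph the top-ranked unlinked pair is "add salt" and "chop onion", because both actions share exactly the same neighbours. A learned model would replace `jaccard_score` with a scoring function over node representations.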

Code Implementations: 1 repo
