CV · Apr 8, 2019

Relational Action Forecasting

Chen Sun, Abhinav Shrivastava, Carl Vondrick, Rahul Sukthankar, Kevin Murphy, Cordelia Schmid

arXiv:1904.04231v121.386 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of predicting future actions in multi-person video scenes for applications like surveillance or human-computer interaction, representing an incremental advance with specific performance gains.

The paper tackles multi-person action forecasting in videos by jointly modeling temporal and spatial interactions among actors using a recurrent graph, achieving a significant improvement in early action classification on J-HMDB from 48% to 60%.

This paper focuses on multi-person action forecasting in videos. More precisely, given a history of H previous frames, the goal is to detect actors and to predict their future actions for the next T frames. Our approach jointly models temporal and spatial interactions among different actors by constructing a recurrent graph, using actor proposals obtained with Faster R-CNN as nodes. Our method learns to select a subset of discriminative relations without requiring explicit supervision, thus enabling us to tackle challenging visual data. We refer to our model as Discriminative Relational Recurrent Network (DRRN). Evaluation of action prediction on AVA demonstrates the effectiveness of our proposed method compared to simpler baselines. Furthermore, we significantly improve performance on the task of early action classification on J-HMDB, from the previous SOTA of 48% to 60%.
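The idea of a recurrent graph over actor nodes can be illustrated with a toy sketch. This is not the authors' DRRN implementation: the abstract does not specify the architecture, so everything here is an assumption made for illustration. Dot-product soft attention stands in for the learned selection of discriminative relations, a fixed `tanh` mixing rule stands in for the recurrent unit, and the hypothetical two-dimensional `actors` vectors stand in for Faster R-CNN proposal features. The names `relation_weights`, `graph_step`, `rollout`, and `mix` are likewise invented.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def relation_weights(states):
    """For each actor i, soft attention weights over all actors j (self included).

    Assumption: plain dot-product attention, used only as a stand-in for
    a learned relation-selection mechanism.
    """
    return [softmax([dot(h_i, h_j) for h_j in states]) for h_i in states]


def graph_step(states, mix=0.5):
    """One recurrent update: each actor blends its state with a weighted
    sum of all actor states (the 'message' from its relations)."""
    weights = relation_weights(states)
    dim = len(states[0])
    new_states = []
    for h_i, w_i in zip(states, weights):
        message = [sum(w * h_j[d] for w, h_j in zip(w_i, states))
                   for d in range(dim)]
        new_states.append([math.tanh((1 - mix) * h_i[d] + mix * message[d])
                           for d in range(dim)])
    return new_states


def rollout(states, steps):
    """Unroll the graph for a number of future steps (T in the abstract)."""
    history = []
    for _ in range(steps):
        states = graph_step(states)
        history.append(states)
    return history


# Hypothetical actor features: actors 0 and 1 are similar, actor 2 differs.
actors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
future = rollout(actors, steps=3)
```

In this toy setting, actor 0 assigns more weight to the similar actor 1 than to the dissimilar actor 2, which mimics in a very loose way how a model might emphasize some relations over others. A real system would learn these weights from data and attach an action classifier to each node state.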

