Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations
This work addresses the visual prediction of human-object interactions for video analysis applications and represents an incremental advance in graph-based methods.
The paper predicts human-object interactions in videos: it forecasts the class of the ongoing interaction and the next active object(s), together with the time at which the interaction will occur. It reports high prediction accuracy on the MSR Daily Activities and CAD120 datasets.
We present a novel approach for the visual prediction of human-object interactions in videos. Rather than forecasting the human and object motion or the future hand-object contact points, we aim at predicting (a) the class of the ongoing human-object interaction and (b) the class(es) of the next active object(s) (NAOs), i.e., the object(s) that will be involved in the interaction in the near future, as well as the time at which the interaction will occur. Graph matching relies on the efficient Graph Edit Distance (GED) method. The experimental evaluation of the proposed approach was conducted using two well-established video datasets that contain human-object interactions, namely the MSR Daily Activities and the CAD120. High prediction accuracy was obtained for both action prediction and NAO forecasting.
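
To make the role of the Graph Edit Distance concrete, the following is a minimal sketch of GED-based matching between a partially observed activity graph and complete class prototypes. It is not the method of the paper: the graph container, the node labels (`hand`, `cup`, `phone`, `head`), the unit edit costs, and the helper names `make_graph`, `graph_edit_distance`, and `nearest_prototype` are all assumptions made for illustration, and the distance is computed by brute force rather than by an efficient GED algorithm.

```python
from itertools import permutations


def make_graph(labels, edges):
    """Hypothetical graph container: node labels plus undirected edges."""
    return dict(labels), {frozenset(e) for e in edges}


def graph_edit_distance(g1, g2):
    """Exact GED with unit costs, by brute force over node mappings.

    Exponential in graph size; intended for tiny illustrative graphs only.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    nodes1, nodes2 = list(labels1), list(labels2)
    # Each g1 node maps to a distinct g2 node or to None (deletion).
    targets = nodes2 + [None] * len(nodes1)
    best = float("inf")
    for image in set(permutations(targets, len(nodes1))):
        mapping = dict(zip(nodes1, image))
        cost = 0
        for u, v in mapping.items():
            if v is None or labels1[u] != labels2[v]:
                cost += 1  # node deletion or label substitution
        matched = [v for v in image if v is not None]
        cost += len(nodes2) - len(matched)  # node insertions
        preserved = set()
        for edge in edges1:
            a, b = tuple(edge)
            mapped = frozenset((mapping[a], mapping[b]))
            if None not in mapped and mapped in edges2:
                preserved.add(mapped)
            else:
                cost += 1  # edge deletion
        cost += len(edges2) - len(preserved)  # edge insertions
        best = min(best, cost)
    return best


def nearest_prototype(query, prototypes):
    """Return (class name, distance) of the closest prototype graph."""
    return min(
        ((name, graph_edit_distance(query, g)) for name, g in prototypes.items()),
        key=lambda pair: pair[1],
    )


# Assumed toy prototypes of two complete interactions.
prototypes = {
    "drinking": make_graph({0: "hand", 1: "cup", 2: "head"}, [(0, 1), (1, 2)]),
    "calling": make_graph({0: "hand", 1: "phone", 2: "head"}, [(0, 1), (1, 2)]),
}
# Assumed partial observation: the hand has reached the cup.
partial = make_graph({0: "hand", 1: "cup"}, [(0, 1)])

label, dist = nearest_prototype(partial, prototypes)
print(label, dist)  # drinking 2
```

In this toy setting the partial graph needs one node insertion and one edge insertion to become the "drinking" prototype, but an additional label substitution to become "calling", so the nearest prototype gives the predicted interaction class.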