A Graph-Based Framework to Bridge Movies and Synopses
This work addresses movie understanding for researchers in video analytics, though its contribution is incremental: it builds on existing datasets and methods for video-text matching.
The authors tackle the problem of matching movie segments to synopsis paragraphs by constructing the Movie Synopses Associations (MSA) dataset, which covers 327 movies, and by developing a graph-based framework that integrates event dynamics and character interactions. The framework is reported to improve matching accuracy markedly over conventional feature-based methods, though no specific numbers are given here.
Inspired by the remarkable advances in video analytics, research teams are stepping towards a greater ambition: movie understanding. However, movies differ significantly from the activity videos in conventional datasets. Movies are generally much longer and contain much richer temporal structures. More importantly, the interactions among characters play a central role in expressing the underlying story. To facilitate efforts in this direction, we construct a dataset called Movie Synopses Associations (MSA) over 327 movies, which provides a synopsis for each movie together with annotated associations between synopsis paragraphs and movie segments. On top of this dataset, we develop a framework to match movie segments with synopsis paragraphs. The framework integrates different aspects of a movie, including event dynamics and character interactions, and uses a graph-based formulation to match them with parsed paragraphs. Our study shows that the proposed framework remarkably improves the matching accuracy over conventional feature-based methods. It also reveals the importance of narrative structures and character interactions in movie understanding.
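To make the matching idea concrete, the following is a minimal toy sketch and not the authors' formulation. It assumes each movie segment and each parsed paragraph is reduced to a hypothetical event feature vector plus a set of character-interaction edges, and it scores a pair by mixing cosine similarity on events with Jaccard overlap on interaction edges. The names `match_score`, `match_paragraphs`, and `event_weight`, the dictionary layout, and the toy data are all illustrative assumptions.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def edge_jaccard(edges_a, edges_b):
    """Jaccard overlap between two sets of undirected character-interaction edges."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(segment, paragraph, event_weight=0.5):
    """Weighted mix of event similarity and interaction-graph overlap (assumed scoring rule)."""
    event_sim = cosine(segment["event"], paragraph["event"])
    char_sim = edge_jaccard(segment["interactions"], paragraph["interactions"])
    return event_weight * event_sim + (1.0 - event_weight) * char_sim


def match_paragraphs(segments, paragraphs, event_weight=0.5):
    """For each paragraph, return the index of the highest-scoring segment."""
    return [
        max(range(len(segments)),
            key=lambda i: match_score(segments[i], p, event_weight))
        for p in paragraphs
    ]


# Toy data: two segments and two paragraphs, deliberately listed in opposite order.
segments = [
    {"event": [1.0, 0.0], "interactions": [("Alice", "Bob")]},
    {"event": [0.0, 1.0], "interactions": [("Bob", "Carol")]},
]
paragraphs = [
    {"event": [0.1, 0.9], "interactions": [("Carol", "Bob")]},
    {"event": [0.9, 0.1], "interactions": [("Bob", "Alice")]},
]

print(match_paragraphs(segments, paragraphs))  # prints [1, 0]
```

In this sketch the two cues are simply averaged; a framework of the kind described above would learn the features and use a richer graph matching procedure, so the weighting here only illustrates how event and character information can both contribute to one score.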