CV
Jul 17, 2020

SumGraph: Video Summarization via Recursive Graph Modeling

arXiv:2007.08809v1
70 citations
AI Analysis

This paper addresses the problem of efficiently selecting keyframes for video summarization. It offers a new approach that improves accuracy for applications such as video indexing and browsing, though the contribution is incremental because it builds on existing graph-based methods.

The paper tackles video summarization by framing it as a graph modeling problem. It proposes SumGraph, which recursively refines the semantic relationships among frames, and reports state-of-the-art performance on multiple benchmarks in both supervised and unsupervised settings.

The goal of video summarization is to select keyframes that are visually diverse and can represent a whole story of an input video. State-of-the-art approaches for video summarization have mostly regarded the task as a frame-wise keyframe selection problem by aggregating all frames with equal weight. However, to find informative parts of the video, it is necessary to consider how all the frames of the video are related to each other. To this end, we cast video summarization as a graph modeling problem. We propose recursive graph modeling networks for video summarization, termed SumGraph, to represent a relation graph, where frames are regarded as nodes and nodes are connected by semantic relationships among frames. Our networks accomplish this through a recursive approach to refine an initially estimated graph to correctly classify each node as a keyframe by reasoning the graph representation via graph convolutional networks. To leverage SumGraph in a more practical environment, we also present a way to adapt our graph modeling in an unsupervised fashion. With SumGraph, we achieved state-of-the-art performance on several benchmarks for video summarization in both supervised and unsupervised manners.
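The idea in the abstract (frames as nodes, semantic relationships as edges, and a graph that is re-estimated after each round of graph convolution) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the initial graph is estimated, how the layers are parameterized, or which losses are used. The cosine-similarity affinity, the row normalization, the random weights, and every function name below are assumptions chosen only to make the recursion concrete.

```python
import math
import random

def cosine_affinity(feats):
    """Dense affinity matrix from pairwise cosine similarity, clipped at zero.
    (Assumed edge definition; the abstract only says 'semantic relationships'.)"""
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in feats]
    n = len(feats)
    return [[max(0.0, sum(a * b for a, b in zip(feats[i], feats[j]))
                 / (norms[i] * norms[j]))
             for j in range(n)] for i in range(n)]

def normalize_rows(adj):
    """Row-normalize so each node takes a weighted average of its neighbours."""
    return [[v / (sum(row) or 1.0) for v in row] for row in adj]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One graph convolution step: ReLU(normalize(A) @ X @ W)."""
    h = matmul(matmul(normalize_rows(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in h]

def recursive_keyframe_scores(feats, weights, readout):
    """Alternate graph convolution and graph re-estimation, then score each node."""
    adj = cosine_affinity(feats)      # initial graph estimate
    h = feats
    for w in weights:
        h = gcn_layer(adj, h, w)      # reason over the current graph
        adj = cosine_affinity(h)      # refine the graph from the updated nodes
    # Sigmoid readout: one keyframe score per frame (node).
    return [1.0 / (1.0 + math.exp(-sum(x * r for x, r in zip(row, readout))))
            for row in h]

# Toy input: 6 frames with 8-dimensional features and untrained random weights.
random.seed(0)
frames = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
weights = [[[random.gauss(0, 0.5) for _ in range(8)] for _ in range(8)]
           for _ in range(2)]
readout = [random.gauss(0, 0.5) for _ in range(8)]
scores = recursive_keyframe_scores(frames, weights, readout)
```

With untrained weights the scores carry no meaning. In a supervised setting the weights would be fit against keyframe annotations, and a summary would typically be formed by keeping the highest-scoring frames.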

