Multi-View Surveillance Video Summarization via Joint Embedding and Sparse Optimization
This work addresses the challenge of efficiently summarizing multi-view videos in uncalibrated camera networks. The contribution is incremental: it extends existing video summarization methods to multi-view scenarios.
The paper tackles the problem of summarizing multi-view surveillance videos without requiring prior alignment between cameras. It proposes an unsupervised framework that jointly optimizes an embedding capturing multi-view correlations and a sparse selection of representative shots, and it reports clear improvements over state-of-the-art methods in experiments on multiple datasets.
Most traditional video summarization methods are designed to generate effective summaries for single-view videos, and thus they cannot fully exploit the complicated intra- and inter-view correlations in summarizing multi-view videos in a camera network. In this paper, with the aim of summarizing multi-view videos, we introduce a novel unsupervised framework via joint embedding and sparse representative selection. The objective function is two-fold. The first objective is to capture the multi-view correlations via an embedding, which helps in extracting a diverse set of representatives. The second is to use an $\ell_{2,1}$-norm to model the sparsity while selecting representative shots for the summary. We propose to jointly optimize both objectives, such that the embedding not only characterizes the correlations but also reflects the requirements of sparse representative selection. We present an efficient alternating algorithm based on half-quadratic minimization to solve the proposed non-smooth and non-convex objective, together with a convergence analysis. A key advantage of the proposed approach over the state of the art is that it can summarize multi-view videos without assuming any prior correspondences or alignment between them, e.g., in uncalibrated camera networks. Rigorous experiments on several multi-view datasets demonstrate that our approach clearly outperforms the state-of-the-art methods.
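To make the sparse representative selection step more concrete, the following is a minimal sketch of row-sparse selection with an $\ell_{2,1}$ penalty. It is not the paper's joint objective: the embedding term is omitted, and the simplified problem min_Z ||X - XZ||_F^2 + lam * ||Z||_{2,1} is assumed instead. The solver is an iteratively reweighted least-squares scheme in the spirit of half-quadratic minimization. The function name `select_representatives` and the parameters `lam`, `n_iter`, and `eps` are hypothetical and chosen only for illustration.

```python
import numpy as np

def select_representatives(X, lam=1.0, n_iter=50, eps=1e-8):
    """Illustrative sketch (assumed simplified objective, not the paper's).

    X: (d, n) array whose columns are shot feature vectors.
    Returns Z (n, n) and one score per shot (the l2 norm of each row of Z).
    Shots whose rows of Z have large norm act as representatives.
    """
    n = X.shape[1]
    G = X.T @ X  # Gram matrix of the shots
    # Initialise with a ridge solution so every row starts non-zero.
    Z = np.linalg.solve(G + lam * np.eye(n), G)
    for _ in range(n_iter):
        # Auxiliary weights: smoothed inverse row norms of the current Z.
        row_norms = np.sqrt(np.sum(Z ** 2, axis=1) + eps)
        P = np.diag(1.0 / (2.0 * row_norms))
        # With the weights fixed, the problem is quadratic in Z and has
        # the closed-form solution of (G + lam * P) Z = G.
        Z = np.linalg.solve(G + lam * P, G)
    scores = np.linalg.norm(Z, axis=1)
    return Z, scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 12))  # 12 synthetic "shots", 5-dim features
    _, scores = select_representatives(X, lam=0.5)
    top_k = np.argsort(scores)[::-1][:3]  # indices of the 3 highest-scoring shots
    print(top_k)
```

Under these assumptions, a summary of length k would be formed by taking the k shots with the largest row norms. In the paper's setting the input would be the learned embedding rather than raw features, and the embedding and the selection matrix would be updated in alternation.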