CV · AI · Jan 22, 2022

LTC-SUM: Lightweight Client-driven Personalized Video Summarization Framework Using 2D CNN

arXiv:2201.09049v2 · 23 citations
Originality: Incremental advance
AI Analysis

This paper addresses computational and privacy bottlenecks in video summarization on resource-constrained end-user devices, though it appears incremental: it builds on existing methods and focuses on efficiency.

The paper tackles the problem of computationally intensive video summarization by proposing a lightweight client-driven framework that uses thumbnail containers and a 2D CNN, achieving significant computational efficiency and higher user ratings in experiments on 18 feature-length videos.

This paper proposes a novel lightweight thumbnail container-based summarization (LTC-SUM) framework for full feature-length videos. This framework generates a personalized keyshot summary for concurrent users by using the computational resources of the end-user device. State-of-the-art methods that acquire and process entire video data to generate video summaries are highly computationally intensive. In this regard, the proposed LTC-SUM method uses lightweight thumbnails to handle the complex process of detecting events. This significantly reduces computational complexity and improves communication and storage efficiency by resolving computational and privacy bottlenecks in resource-constrained end-user devices. These improvements were achieved by designing a lightweight 2D CNN model to extract features from thumbnails, which helped select and retrieve only a handful of specific segments. Extensive quantitative experiments on a set of 18 full feature-length videos (approximately 32.9 h in duration) showed that the proposed method is significantly more computationally efficient than state-of-the-art methods on the same end-user device configurations. Joint qualitative assessments by 56 participants showed that they gave higher ratings to the summaries generated using the proposed method. To the best of our knowledge, this is the first attempt at designing a fully client-driven personalized keyshot video summarization framework using thumbnail containers for feature-length videos.
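
The abstract's idea of scoring thumbnails on the client and then retrieving only a few segments can be illustrated with a minimal Python sketch. This is not the authors' implementation: the per-thumbnail scores are assumed to come from some lightweight 2D CNN that is not shown, and the names `Segment`, `score_segments`, and `select_segments`, the fixed segment length, and the greedy time-budget rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Segment:
    index: int     # position of the segment in the video
    start: float   # start time in seconds
    end: float     # end time in seconds
    score: float   # mean thumbnail score for this segment


def score_segments(thumb_scores: Sequence[float],
                   thumbs_per_segment: int,
                   segment_seconds: float) -> List[Segment]:
    """Group per-thumbnail scores into fixed-length segments.

    thumb_scores is assumed to hold one relevance score per thumbnail,
    e.g. the output of a small 2D CNN run on the client (hypothetical).
    """
    segments = []
    for i in range(0, len(thumb_scores), thumbs_per_segment):
        chunk = thumb_scores[i:i + thumbs_per_segment]
        idx = i // thumbs_per_segment
        segments.append(Segment(
            index=idx,
            start=idx * segment_seconds,
            end=(idx + 1) * segment_seconds,
            score=sum(chunk) / len(chunk),
        ))
    return segments


def select_segments(segments: Sequence[Segment],
                    budget_seconds: float) -> List[Segment]:
    """Greedily keep the highest-scoring segments within a time budget.

    Only the returned segments would need to be fetched from the server;
    the rest of the video is never downloaded. The greedy rule is an
    assumption of this sketch, not the paper's selection method.
    """
    chosen = []
    used = 0.0
    for seg in sorted(segments, key=lambda s: s.score, reverse=True):
        length = seg.end - seg.start
        if used + length <= budget_seconds:
            chosen.append(seg)
            used += length
    # Restore chronological order for playback.
    return sorted(chosen, key=lambda s: s.index)


# Usage: eight thumbnail scores, two thumbnails per 10-second segment.
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.9]
summary = select_segments(score_segments(scores, 2, 10.0), budget_seconds=20.0)
```

In this toy run the client would request only the second and fourth segments, which is the sense in which thumbnail-level scoring can cut both computation and transfer.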

Code Implementations: 1 repo