CV · AI · May 28, 2025

EdgeVidSum: Real-Time Personalized Video Summarization at the Edge

arXiv:2506.03171v1 · 1 citation · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses computational efficiency, personalization, and privacy for users watching long-form videos on edge devices. The contribution appears incremental, since it builds on existing neural methods and adds optimizations.

EdgeVidSum tackles real-time personalized video summarization on edge devices by using thumbnail-based techniques to reduce computational complexity, achieving efficient processing on resource-constrained hardware like Jetson Nano.

EdgeVidSum is a lightweight method that generates personalized, fast-forward summaries of long-form videos directly on edge devices. The proposed approach enables real-time video summarization while safeguarding user privacy through local data processing using innovative thumbnail-based techniques and efficient neural architectures. Unlike conventional methods that process entire videos frame by frame, the proposed method uses thumbnail containers to significantly reduce computational complexity without sacrificing semantic relevance. The framework employs a hierarchical analysis approach, where a lightweight 2D CNN model identifies user-preferred content from thumbnails and generates timestamps to create fast-forward summaries. Our interactive demo highlights the system's ability to create tailored video summaries for long-form videos, such as movies, sports events, and TV shows, based on individual user preferences. The entire computation occurs seamlessly on resource-constrained devices like Jetson Nano, demonstrating how EdgeVidSum addresses the critical challenges of computational efficiency, personalization, and privacy in modern video consumption environments.
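
The last step described above, turning per-thumbnail predictions into timestamps for a fast-forward summary, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each thumbnail has already been given a preference score in [0, 1] by some classifier and that thumbnails are sampled at a fixed interval. The function name `scores_to_segments` and the parameters `interval_s` and `threshold` are hypothetical.

```python
from typing import List, Tuple


def scores_to_segments(
    scores: List[float],
    interval_s: float,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive above-threshold thumbnails into
    (start, end) segments, in seconds.

    Assumption (not from the paper): thumbnail i covers the time span
    [i * interval_s, (i + 1) * interval_s).
    """
    segments: List[Tuple[float, float]] = []
    run_start = None  # index of the first thumbnail in the current run

    for i, score in enumerate(scores):
        if score >= threshold:
            if run_start is None:
                run_start = i  # a new run of preferred content begins
        elif run_start is not None:
            # The run ended at the previous thumbnail; close the segment.
            segments.append((run_start * interval_s, i * interval_s))
            run_start = None

    # Close a run that extends to the end of the video.
    if run_start is not None:
        segments.append((run_start * interval_s, len(scores) * interval_s))

    return segments


# Hypothetical scores for five thumbnails sampled every 10 seconds.
print(scores_to_segments([0.1, 0.9, 0.8, 0.2, 0.7], interval_s=10.0))
# [(10.0, 30.0), (40.0, 50.0)]
```

A player could then show the returned segments at normal speed and skip or fast-forward through the rest. How the actual system selects and merges segments is not specified in the abstract.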

