PRISM: Perceptual Recognition for Identifying Standout Moments in Human-Centric Keyframe Extraction
PRISM addresses the need for efficient video analysis tools to combat misinformation and radicalization on online platforms, though the contribution is incremental: it builds on existing keyframe extraction methods and adds a perceptual focus.
The paper tackles the problem of identifying standout moments in online videos for content moderation and summarization. It introduces PRISM, a lightweight, interpretable, training-free framework that applies perceptual color difference metrics in CIELAB space, and reports strong accuracy and high compression ratios on four benchmark datasets (BBC, TVSum, SumMe, and ClipShots).
Online videos play a central role in shaping political discourse and amplifying cyber social threats such as misinformation, propaganda, and radicalization. Detecting the most impactful or "standout" moments in video content is crucial for content moderation, summarization, and forensic analysis. In this paper, we introduce PRISM (Perceptual Recognition for Identifying Standout Moments), a lightweight and perceptually aligned framework for keyframe extraction. PRISM operates in the CIELAB color space and uses perceptual color difference metrics to identify frames that align with human visual sensitivity. Unlike deep learning-based approaches, PRISM is interpretable, training-free, and computationally efficient, making it well suited for real-time and resource-constrained environments. We evaluate PRISM on four benchmark datasets (BBC, TVSum, SumMe, and ClipShots) and demonstrate that it achieves strong accuracy and fidelity while maintaining high compression ratios. These results highlight PRISM's effectiveness on both structured and unstructured video content, and its potential as a scalable tool for analyzing and moderating harmful or politically sensitive media on online platforms.
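The general idea of selecting keyframes by perceptual color difference in CIELAB space can be illustrated with a minimal sketch. This is not the PRISM algorithm itself: the abstract does not state which color difference formula, aggregation, or threshold the authors use. The sketch assumes the simple CIE76 difference (Euclidean distance in L*a*b*), averages it over all pixels, and compares each frame with the most recently selected keyframe. The function names `srgb_to_lab` and `select_keyframes` and the default `threshold=10.0` are hypothetical choices made for illustration.

```python
import numpy as np

# Standard linear-sRGB -> XYZ matrix and D65 reference white.
_RGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                        [0.2126729, 0.7151522, 0.0721750],
                        [0.0193339, 0.1191920, 0.9503041]])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])


def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] with shape (..., 3) to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma encoding.
    linear = np.where(rgb <= 0.04045,
                      rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB -> XYZ, normalized by the reference white.
    xyz = linear @ _RGB_TO_XYZ.T / _WHITE_D65
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3,
                 np.cbrt(xyz),
                 xyz / (3.0 * delta ** 2) + 4.0 / 29.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def select_keyframes(frames, threshold=10.0):
    """Return indices of frames whose mean CIE76 difference from the
    last selected keyframe exceeds `threshold` (an assumed value)."""
    if len(frames) == 0:
        return []
    keyframes = [0]  # assumption: the first frame is always kept
    last_lab = srgb_to_lab(frames[0])
    for i in range(1, len(frames)):
        lab = srgb_to_lab(frames[i])
        # CIE76: per-pixel Euclidean distance in L*a*b*, then the mean.
        delta_e = np.linalg.norm(lab - last_lab, axis=-1).mean()
        if delta_e > threshold:
            keyframes.append(i)
            last_lab = lab
    return keyframes


# Usage: three black frames followed by two white frames.
black = np.zeros((4, 4, 3))
white = np.ones((4, 4, 3))
selected = select_keyframes([black, black, black, white, white])
```

Because the sketch needs no training and only simple array arithmetic per frame, it reflects the lightweight, interpretable character described in the abstract. More refined perceptual metrics such as CIEDE2000 could be substituted for the CIE76 distance without changing the overall structure.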