CVAILGMay 28, 2025

PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion

arXiv:2505.22564v11 citationsh-index: 5
Originality Highly original
AI Analysis

This addresses computational challenges in large-scale video data processing for deep learning applications, representing a novel method for a known bottleneck.

The paper tackles video dataset condensation by introducing PRISM, which progressively refines and inserts frames to preserve spatial-temporal interdependence, outperforming existing methods on benchmarks with better performance and less storage.

Video dataset condensation has emerged as a critical technique for addressing the computational challenges associated with large-scale video data processing in deep learning applications. While significant progress has been made in image dataset condensation, the video domain presents unique challenges due to the complex interplay between spatial content and temporal dynamics. This paper introduces PRISM, Progressive Refinement and Insertion for Sparse Motion, for video dataset condensation, a novel approach that fundamentally reconsiders how video data should be condensed. Unlike the previous method that separates static content from dynamic motion, our method preserves the essential interdependence between these elements. Our approach progressively refines and inserts frames to fully accommodate the motion in an action while achieving better performance but less storage, considering the relation of gradients for each frame. Extensive experiments across standard video action recognition benchmarks demonstrate that PRISM outperforms existing disentangled approaches while maintaining compact representations suitable for resource-constrained environments.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes