CV | Sep 16, 2023

FrameRS: A Video Frame Compression Model Composed by Self supervised Video Frame Reconstructor and Key Frame Selector

arXiv:2309.09083v1 | h-index: 27 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses video compression for applications requiring reduced storage or bandwidth, but it appears incremental as it adapts existing methods like MAE to video.

The paper tackles video compression by proposing FrameRS, a model that combines a self-supervised frame reconstructor (FrameMAE) with a key frame selector. It compresses a video clip by retaining roughly 30% of its frames as key frames, and the authors report computational efficiency and competitive accuracy.

In this paper, we present a frame reconstruction model, FrameRS. It consists of a self-supervised video frame reconstructor and a key frame selector. The frame reconstructor, FrameMAE, is developed by adapting the principles of the Masked Autoencoder for Images (MAE) to the video context. The key frame selector, Frame Selector, is built on a CNN architecture. By taking the high-level semantic information from the encoder of FrameMAE as its input, it can predict the key frames at low computational cost. Integrated with our bespoke Frame Selector, FrameMAE can effectively compress a video clip by retaining approximately 30% of its pivotal frames. Performance-wise, our model showcases computational efficiency and competitive accuracy, marking a notable improvement over traditional key frame extraction algorithms. The implementation is available on GitHub.
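The "retain approximately 30% of the frames" step can be illustrated with a minimal sketch. It is not the authors' implementation: the per-frame scores below are made-up numbers standing in for whatever the CNN-based Frame Selector would output, and the function names `select_key_frames` and `frame_mask` are hypothetical. The sketch only shows how a keep ratio turns scores into a set of retained frame indices and a mask marking the frames a reconstructor would have to fill in.

```python
def select_key_frames(scores, keep_ratio=0.3):
    """Return the sorted indices of the highest-scoring frames.

    `scores` is assumed to hold one importance score per frame
    (hypothetical stand-in for a learned selector's output).
    """
    if not scores:
        return []
    # Keep at least one frame, otherwise about keep_ratio of the clip.
    n_keep = max(1, round(keep_ratio * len(scores)))
    # Rank frame indices by descending score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Restore temporal order for the retained frames.
    return sorted(ranked[:n_keep])


def frame_mask(num_frames, kept):
    """True marks a dropped frame that would need to be reconstructed."""
    kept_set = set(kept)
    return [i not in kept_set for i in range(num_frames)]


# Illustrative scores for a 10-frame clip (invented values).
scores = [0.9, 0.1, 0.2, 0.8, 0.3, 0.1, 0.7, 0.2, 0.1, 0.6]
kept = select_key_frames(scores)
mask = frame_mask(len(scores), kept)
print(kept)       # indices of retained key frames
print(sum(mask))  # number of frames left to reconstruct
```

With a 10-frame clip and a keep ratio of 0.3, three frames are stored and the remaining seven are left for the reconstructor to recover.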

Code Implementations: 1 repo
