SCENE: Semantic-aware Codec Enhancement with Neural Embeddings

arXiv:2601.22189v1h-index: 7

Originality Incremental advance

AI Analysis

This addresses the problem of degraded video quality for users of compressed video streams, but it is incremental as it builds on existing pre-processing and semantic embedding techniques.

The paper tackles the problem of perceptual quality degradation from compression artifacts in standard video codecs by proposing SCENE, a lightweight semantic-aware pre-processing framework that enhances perceptual fidelity. Results show improved performance in objective (MS-SSIM) and perceptual (VMAF) metrics on high-resolution benchmarks, with notable gains in preserving detailed textures in salient regions.

Compression artifacts from standard video codecs often degrade perceptual quality. We propose a lightweight, semantic-aware pre-processing framework that enhances perceptual fidelity by selectively addressing these distortions. Our method integrates semantic embeddings from a vision-language model into an efficient convolutional architecture, prioritizing the preservation of perceptually significant structures. The model is trained end-to-end with a differentiable codec proxy, enabling it to mitigate artifacts from various standard codecs without modifying the existing video pipeline. During inference, the codec proxy is discarded, and SCENE operates as a standalone pre-processor, enabling real-time performance. Experiments on high-resolution benchmarks show improved performance over baselines in both objective (MS-SSIM) and perceptual (VMAF) metrics, with notable gains in preserving detailed textures within salient regions. Our results show that semantic-guided, codec-aware pre-processing is an effective approach for enhancing compressed video streams.

View on arXiv PDF

Similar