IV, CV | Jul 11, 2024

OMR-NET: a two-stage octave multi-scale residual network for screen content image compression

Shiqi Jiang, Ting Ren, Congrui Fu, Shuai Li, Hui Yuan

arXiv:2407.08545v16.37 citations | h-index: 10 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses image compression for screen content, which is important for applications like remote desktops and video streaming, but it is incremental as it builds on existing learned compression methods with specific adaptations.

The authors tackle the compression of screen content images, which have distinctive characteristics such as being noise-free and high in contrast, by proposing OMR-NET, a two-stage octave multi-scale residual network. In the reported experiments, the method outperforms existing learned image compression methods in rate-distortion performance on screen content images.
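To make "rate-distortion performance" concrete, here is a minimal sketch of the two quantities such a comparison usually rests on: bits per pixel for rate and PSNR for distortion. The choice of PSNR and the function names `psnr` and `bits_per_pixel` are assumptions for illustration; the summary above does not state which distortion metric the authors report.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized flat lists of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of samples")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: distortion is zero
    return 10.0 * math.log10(peak ** 2 / mse)


def bits_per_pixel(num_bytes, width, height):
    """Rate of a compressed image: total bits divided by pixel count."""
    return 8.0 * num_bytes / (width * height)
```

A codec is said to outperform another in rate-distortion terms when it reaches a higher PSNR at the same bits per pixel, or the same PSNR at fewer bits per pixel.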

Screen content (SC) differs from natural scene (NS) content in that it is noise-free, contains repetitive patterns, and has high contrast. To address the inadequacies of current learned image compression (LIC) methods for SC, we propose improved two-stage octave convolutional residual blocks (IToRB) for high- and low-frequency feature extraction and cascaded two-stage multi-scale residual blocks (CTMSRB) for improved multi-scale learning and nonlinearity in SC. Additionally, we employ a window-based attention module (WAM) to capture pixel correlations, especially in high-contrast regions of the image. We also construct a diverse SC image compression dataset (SDU-SCICD2K) for training, which includes text, charts, graphics, animation, movie, and game images, as well as images that mix SC and NS content. Experimental results show that our method, which is better suited to SC than to NS data, outperforms existing LIC methods in rate-distortion performance on SC images. The code is publicly available at https://github.com/SunshineSki/OMR Net.git.
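The idea behind octave-style processing is to keep a low-frequency component at half resolution alongside a full-resolution high-frequency component. The sketch below illustrates only that decomposition on a single 2D patch, using 2x2 average pooling and nearest-neighbour upsampling. It is not the authors' IToRB and contains no learned convolutions; all function names are hypothetical, and the patch height and width are assumed to be even.

```python
def avg_pool_2x2(grid):
    """Downsample a 2D list (even height and width) by averaging 2x2 blocks."""
    h, w = len(grid), len(grid[0])
    return [
        [(grid[y][x] + grid[y][x + 1] + grid[y + 1][x] + grid[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def upsample_2x(grid):
    """Nearest-neighbour upsampling by a factor of 2 in both directions."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))  # repeat the row vertically
    return out


def split_frequencies(grid):
    """Return (low, high): a half-resolution average and a full-resolution residual."""
    low = avg_pool_2x2(grid)
    up = upsample_2x(low)
    high = [[g - u for g, u in zip(grow, urow)] for grow, urow in zip(grid, up)]
    return low, high


def merge_frequencies(low, high):
    """Inverse of split_frequencies: upsample the low band and add the residual."""
    up = upsample_2x(low)
    return [[u + h for u, h in zip(urow, hrow)] for urow, hrow in zip(up, high)]


# A toy high-contrast patch: flat blocks on top, a one-pixel stripe pattern below.
patch = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 0, 255],
    [0, 255, 0, 255],
]
low, high = split_frequencies(patch)
```

On this patch the flat blocks end up entirely in `low` with a zero residual, while the stripe pattern leaves large values in `high`. Sharp edges and fine text in screen content behave like the stripes, which is one reason to process the two bands separately.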
