CV, MM, IV | Oct 18, 2020

Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution

arXiv:2010.08919v2 | 10 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enhancing image quality for high-level computer vision tasks like text recognition and face detection, offering an incremental improvement over existing methods.

The paper generates high-quality, high-resolution images from compressed low-resolution inputs by handling compression artifacts reduction and super-resolution jointly. The resulting model outperforms previous methods with a 26.2% shorter runtime. Used as a preprocessing step, it raises text recognition accuracy from 85.30% to 85.75% and average precision for tiny face detection from 0.317 to 0.611.
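To put the reported figures in perspective, they can be converted into absolute and relative gains. The following is a minimal arithmetic sketch that uses only the numbers quoted above. The helper names are illustrative and do not come from the paper, and the speedup figure assumes that "26.2% shorter runtime" means the new runtime is 73.8% of the baseline.

```python
def absolute_gain(before: float, after: float) -> float:
    """Difference between the improved and the baseline score."""
    return after - before


def relative_gain(before: float, after: float) -> float:
    """Improvement expressed as a fraction of the baseline score."""
    return (after - before) / before


def speedup_from_reduction(reduction: float) -> float:
    """Speedup factor implied by a fractional runtime reduction (assumed meaning)."""
    return 1.0 / (1.0 - reduction)


# Figures quoted in the summary above.
text_abs = absolute_gain(85.30, 85.75)   # in percentage points
text_rel = relative_gain(85.30, 85.75)
face_abs = absolute_gain(0.317, 0.611)   # in average-precision units
face_rel = relative_gain(0.317, 0.611)
speedup = speedup_from_reduction(0.262)

print(f"Text recognition: +{text_abs:.2f} points ({text_rel:.2%} relative)")
print(f"Tiny face detection: +{face_abs:.3f} AP ({face_rel:.2%} relative)")
print(f"Implied speedup: {speedup:.2f}x")
```

Read this way, the text recognition gain is small (about 0.45 percentage points), while the tiny face detection AP nearly doubles.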

Due to the limits of bandwidth and storage space, digital images are usually down-scaled and compressed when transmitted over networks, resulting in loss of details and jarring artifacts that can lower the performance of high-level visual tasks. In this paper, we aim to generate an artifact-free high-resolution image from a low-resolution one compressed with an arbitrary quality factor by exploring joint compression artifacts reduction (CAR) and super-resolution (SR) tasks. First, we propose a context-aware joint CAR and SR neural network (CAJNN) that integrates both local and non-local features to solve CAR and SR in one stage. Finally, a deep reconstruction network is adopted to predict high-quality, high-resolution images. Evaluation on CAR and SR benchmark datasets shows that our CAJNN model outperforms previous methods and also takes 26.2% shorter runtime. Based on this model, we explore addressing two critical challenges in high-level computer vision: optical character recognition of low-resolution texts, and extremely tiny face detection. We demonstrate that CAJNN can serve as an effective image preprocessing method and improve the accuracy for real-scene text recognition (from 85.30% to 85.75%) and the average precision for tiny face detection (from 0.317 to 0.611).
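The abstract does not name the codec, but "quality factor" is the usual JPEG term. Assuming JPEG, the sketch below shows how a quality factor rescales a quantization table under the scaling rule used by the IJG libjpeg reference implementation. A lower quality factor gives coarser quantization steps and therefore stronger artifacts. Only the first row of the standard luminance table is used, and the function names are illustrative rather than taken from the paper.

```python
# First row of the standard JPEG luminance quantization table.
BASE_LUMA_ROW = [16, 11, 10, 16, 24, 40, 51, 61]


def quality_to_scale(quality: int) -> int:
    """Map a quality factor in [1, 100] to a percentage scale (IJG libjpeg rule)."""
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality
    return 200 - 2 * quality


def scaled_row(quality: int, base=BASE_LUMA_ROW) -> list:
    """Scale quantization steps, clamping to [1, 255] as in baseline JPEG."""
    scale = quality_to_scale(quality)
    return [max(1, min(255, (q * scale + 50) // 100)) for q in base]


# Lower quality -> larger steps -> more detail discarded.
for qf in (10, 50, 90):
    print(qf, scaled_row(qf))
```

Since the quantization steps vary this widely with the quality factor, a model that must handle an arbitrary quality factor has to cope with anything from light to severe artifacts.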

