LG · Apr 1, 2025

Input Resolution Downsizing as a Compression Technique for Vision Deep Learning Systems

Jeremy Morlier, Mathieu Leonardon, Vincent Gripon

arXiv:2504.03749v14.12 citations · h-index: 2 · IJCNN

Originality: Synthesis-oriented

AI Analysis

This work addresses the need to lighten models for vision applications, though it is incremental as it explores an under-explored but straightforward approach.

The paper tackles model compression in vision deep learning by exploring input resolution reduction as a complementary technique to pruning, quantization, and knowledge distillation, demonstrating competitive performance on classification and segmentation tasks while significantly reducing computational and memory requirements.

Model compression is a critical area of research in deep learning, particularly in vision, driven by the need to reduce models' memory and computational footprints. While numerous methods for model compression have been proposed, most focus on pruning, quantization, or knowledge distillation. In this work, we delve into an under-explored avenue: reducing the resolution of the input image as a complementary approach to other types of compression. By systematically investigating the impact of input resolution reduction on both classification and semantic segmentation, and on both convnets and transformer-based architectures, we demonstrate that this strategy provides an interesting alternative for model compression. Our experimental results on standard benchmarks highlight the potential of this method, which achieves competitive performance while significantly reducing computational and memory requirements. This study establishes input resolution reduction as a viable and promising direction in the broader landscape of model compression techniques for vision applications.
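To illustrate why lowering input resolution can reduce compute and memory, the following is a minimal back-of-the-envelope sketch, not taken from the paper. It assumes a single "same"-padded convolution layer and a ViT-style tokenizer with a patch size of 16; the function names and the layer sizes (64 channels, 3x3 kernel, 224 vs. 112 pixels) are hypothetical choices made only for illustration.

```python
def conv2d_macs(height, width, c_in, c_out, kernel=3, stride=1):
    """Multiply-accumulates of one 'same'-padded 2D convolution (assumed layer)."""
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    return out_h * out_w * c_in * c_out * kernel * kernel


def activation_elems(height, width, c_out, stride=1):
    """Number of output activation values of that same layer."""
    out_h = -(-height // stride)
    out_w = -(-width // stride)
    return out_h * out_w * c_out


def vit_tokens(side, patch=16):
    """Patch tokens for a square image (patch size 16 is an assumption)."""
    return (side // patch) ** 2


# Halving each side of the input divides per-layer MACs and activations by 4.
macs_ratio = conv2d_macs(224, 224, 64, 64) / conv2d_macs(112, 112, 64, 64)
act_ratio = activation_elems(224, 224, 64) / activation_elems(112, 112, 64)

# For a patch-based transformer the token count also drops by 4, so the
# pairwise attention term (proportional to tokens squared) drops by 16.
token_ratio = vit_tokens(224) / vit_tokens(112)
attention_ratio = token_ratio ** 2

print(macs_ratio, act_ratio, token_ratio, attention_ratio)
```

These ratios describe only per-layer cost under the stated assumptions; they say nothing about the accuracy obtained at the lower resolution, which is what the paper evaluates experimentally.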
