CV · AI · Feb 21, 2024

Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel

arXiv:2402.13536v1 · 2 citations · h-index: 9 · Tiny Papers @ ICLR
Originality: Incremental advance
AI Analysis

This work addresses image compression at micro-bit-per-pixel levels for applications that require minimal storage or bandwidth. Its contribution is incremental, pushing existing AI models to their limits.

The paper tackles image compression at extremely low bitrates by using GPT-4V and DALL-E3 for semantic compression. It reaches bitrates as low as 100 μbpp, up to 10,000 times smaller than JPEG, and hypothesizes that this level is a soft limit on semantic compression at standard image resolutions.

Traditional methods, such as JPEG, perform image compression by operating on structural information, such as pixel values or frequency content. These methods are effective at bitrates of around one bit per pixel (bpp) and higher at standard image sizes. In contrast, text-based semantic compression directly stores concepts and their relationships using natural language, which has evolved with humans to efficiently represent these salient concepts. These methods can operate at extremely low bitrates by disregarding structural information like location, size, and orientation. In this work, we use GPT-4V and DALL-E3 from OpenAI to explore the quality-compression frontier for image compression and identify the limitations of current technology. We push semantic compression as low as 100 $\mu$bpp (up to $10,000\times$ smaller than JPEG) by introducing an iterative reflection process to improve the decoded image. We further hypothesize that this 100 $\mu$bpp level represents a soft limit on semantic compression at standard image resolutions.
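
To give a sense of scale for the 100 μbpp figure, the bitrate arithmetic can be sketched in a few lines of Python. The image size (1024×1024) and the 13-byte caption are assumptions chosen for illustration and are not taken from the paper. The helper names `bits_per_pixel` and `to_micro_bpp` are likewise hypothetical, and the text is counted as raw UTF-8 bytes with no further entropy coding.

```python
def bits_per_pixel(payload_bytes: int, width: int, height: int) -> float:
    """Bitrate in bits per pixel for a payload that encodes a width x height image."""
    return payload_bytes * 8 / (width * height)


def to_micro_bpp(bpp: float) -> float:
    """Convert bits per pixel to micro-bits per pixel."""
    return bpp * 1e6


# Assumed example values (not from the paper).
width, height = 1024, 1024
caption = "a red bicycle".encode("utf-8")  # a 13-byte text description

semantic_bpp = bits_per_pixel(len(caption), width, height)
jpeg_bpp = 1.0  # the roughly 1 bpp regime attributed to JPEG in the abstract
ratio = jpeg_bpp / semantic_bpp

print(f"caption size:  {len(caption)} bytes")
print(f"semantic rate: {to_micro_bpp(semantic_bpp):.1f} micro-bpp")
print(f"JPEG at 1 bpp: {width * height // 8} bytes")
print(f"ratio:         about {ratio:,.0f}x")
```

Under these assumptions, a budget of roughly 100 μbpp leaves room for only about a dozen bytes of text for a one-megapixel image, while 1 bpp corresponds to 131,072 bytes. That gap of about four orders of magnitude is consistent with the "up to $10,000\times$ smaller than JPEG" comparison.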

