The Gap Between Principle and Practice of Lossy Image Coding
This work addresses the performance limitations of lossy image coding (image compression); it is incremental in that it analyzes existing gaps rather than proposing a new method.
The paper identifies a gap between the theoretical rate-distortion bound and empirical performance in learned image coding, attributing it to five effects and quantitatively evaluating three of them to show potential for future improvements.
Lossy image coding is a computing task whose performance is bounded, in principle, by the image's rate-distortion function. This bound, though never accurately characterized, has been approached in practice by deep learning technologies in recent years. Indeed, learned image coding schemes allow direct optimization of the joint rate-distortion cost and thereby outperform handcrafted image coding schemes by a large margin. Still, there remains room for further improvement in the rate-distortion performance of learned image coding. In this article, we identify the gap between the ideal rate-distortion function predicted by Shannon's information theory and the empirical rate-distortion function achieved by state-of-the-art learned image coding schemes, revealing that the gap is incurred by five different effects: modeling effect, approximation effect, amortization effect, digitization effect, and asymptotic effect. We design simulations and experiments to quantitatively evaluate the last three effects, and the results demonstrate the high potential of future lossy image coding technologies.
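
To make the notions of a rate-distortion bound and a joint rate-distortion cost concrete, the following is a minimal sketch on a toy source, not the article's method. It assumes a memoryless Gaussian source under mean-squared error, for which the textbook rate-distortion function R(D) = max(0, 0.5 log2(sigma^2 / D)) is known in closed form; real images have no such closed form, which is why the bound above is described as never accurately characterized. The function names, the Lagrangian form rate + lam * distortion, and the example numbers are illustrative assumptions.

```python
import math


def gaussian_rate_distortion(variance: float, distortion: float) -> float:
    """Textbook R(D), in bits per sample, of a memoryless Gaussian source
    under mean-squared error. Toy stand-in for the (unknown) image R(D)."""
    if variance <= 0 or distortion <= 0:
        raise ValueError("variance and distortion must be positive")
    # Zero rate suffices once the allowed distortion reaches the variance.
    return max(0.0, 0.5 * math.log2(variance / distortion))


def rd_cost(rate: float, distortion: float, lam: float) -> float:
    """Joint rate-distortion cost in Lagrangian form (assumed form):
    rate + lam * distortion, where lam sets the trade-off."""
    return rate + lam * distortion


def rate_gap(empirical_rate: float, variance: float, distortion: float) -> float:
    """Excess rate of a practical codec over the ideal bound at the same
    distortion. A positive value is the kind of gap the article studies."""
    return empirical_rate - gaussian_rate_distortion(variance, distortion)


if __name__ == "__main__":
    # Hypothetical operating point: unit-variance source, MSE of 0.25,
    # and a codec that spends 1.5 bits per sample to reach it.
    print(gaussian_rate_distortion(1.0, 0.25))
    print(rate_gap(1.5, 1.0, 0.25))
    print(rd_cost(1.5, 0.25, lam=4.0))
```

In this toy setting the gap is a single number; the article instead decomposes the gap observed for learned image codecs into five separate effects.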