End-to-end optimization of nonlinear transform codes for perceptual quality
This work addresses image compression for applications that require perceptual fidelity, improving on fixed transform codes and on linear transform codes optimized for mean squared error.
The authors optimize nonlinear transform codes for perceptual quality, introducing an end-to-end framework that improves bitrate and perceptual appearance over fixed DCT codes and over linear transform codes optimized for mean squared error.
We introduce a general framework for end-to-end optimization of the rate--distortion performance of nonlinear transform codes assuming scalar quantization. The framework can be used to optimize any differentiable pair of analysis and synthesis transforms in combination with any differentiable perceptual metric. As an example, we consider a code built from a linear transform followed by a form of multi-dimensional local gain control. Distortion is measured with a state-of-the-art perceptual metric. When optimized over a large database of images, this representation offers substantial improvements in bitrate and perceptual appearance over fixed (DCT) codes, and over linear transform codes optimized for mean squared error.
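To make the structure of such a code concrete, the following is a minimal NumPy sketch of how a rate-distortion objective for a transform code with uniform scalar quantization could be evaluated. It is an illustration under stated assumptions, not the implementation described in the paper. The gain control is assumed to be a divisive normalization with hypothetical parameters `beta` and `gamma`. The synthesis transform is assumed to be a fixed-point approximate inverse. The rate is an empirical entropy of the quantized coefficients, and mean squared error stands in for the perceptual metric. All function names and the trade-off weight `lam` are also hypothetical.

```python
import numpy as np

def analysis(X, W, beta, gamma):
    """Linear transform followed by divisive normalization
    (an ASSUMED form of multi-dimensional local gain control)."""
    Z = W @ X                                   # linear analysis transform
    return Z / np.sqrt(beta + gamma @ (Z ** 2)) # each gain pools squared coefficients

def synthesis(Y, W, beta, gamma, n_iter=20):
    """Approximate inverse of `analysis` by fixed-point iteration (assumed choice).
    Exact when gamma == 0; convergence is not guaranteed for arbitrary parameters."""
    Z = Y.copy()
    for _ in range(n_iter):
        Z = Y * np.sqrt(beta + gamma @ (Z ** 2))
    return W.T @ Z                              # assumes W is orthonormal

def quantize(Y, step=1.0):
    """Uniform scalar quantization with bin width `step`."""
    return step * np.round(Y / step)

def entropy_bits(Q):
    """Empirical entropy (bits per coefficient) of the quantized values."""
    _, counts = np.unique(Q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def rd_objective(X, W, beta, gamma, lam, step=1.0):
    """Rate + lam * distortion; MSE is a stand-in for a perceptual metric."""
    Q = quantize(analysis(X, W, beta, gamma), step)
    X_hat = synthesis(Q, W, beta, gamma)
    rate = entropy_bits(Q)
    distortion = float(np.mean((X - X_hat) ** 2))
    return rate + lam * distortion, rate, distortion

# Usage: columns of X play the role of flattened image patches (random data here).
rng = np.random.default_rng(0)
n, m = 4, 8
W, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthonormal transform
X = rng.standard_normal((n, m))
beta = np.ones((n, 1))
gamma = 0.05 * np.ones((n, n))
loss, rate, distortion = rd_objective(X, W, beta, gamma, lam=10.0)
```

This sketch only evaluates the objective. Rounding has zero gradient almost everywhere, so a gradient-based optimizer would need a differentiable proxy for the quantizer and the rate term, which the sketch does not attempt to provide.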