CV | Jul 23, 2019

Exploring Semantic Segmentation on the DCT Representation

arXiv:1907.10015v2 | 26 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficient image processing for memory-constrained applications, but the advance is incremental: it adapts existing networks to a new input format.

The paper tackles semantic segmentation on JPEG-compressed images by using DCT coefficients, achieving accuracy close to RGB models with similar complexity and reducing required coefficients by 64% for the same accuracy.

Typical convolutional networks are trained and run on RGB images. However, images are often compressed for memory savings and efficient transmission in real-world applications. In this paper, we explore methods for performing semantic segmentation on the discrete cosine transform (DCT) representation defined by the JPEG standard. We first rearrange the DCT coefficients to form a preferred input type, then we tailor an existing network to the DCT inputs. The proposed method achieves accuracy close to that of the RGB model at about the same network complexity. Moreover, we investigate the impact of selecting different DCT components on segmentation performance. With a proper selection, one can achieve the same level of accuracy using only 36% of the DCT coefficients. We further show the robustness of our method under quantization errors. To our knowledge, this paper is the first to explore semantic segmentation on the DCT representation.
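The abstract does not spell out how the coefficients are rearranged or which ones are kept, so the following is only a minimal sketch under stated assumptions: each 8x8 block of a single image plane is transformed with an orthonormal 2-D DCT, the 64 coefficients of a block become 64 channels at 1/8 spatial resolution, and a subset is chosen by a simple low-frequency-first ordering. The function names (`dct_matrix`, `blockwise_dct_channels`, `low_frequency_indices`) and the choice of 23 of 64 coefficients (roughly 36%) are hypothetical illustrations, not the paper's actual layout or selection rule. JPEG-specific steps such as level shifting, quantization, and chroma subsampling are omitted.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    x = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)       # DC row has its own scale factor
    return m


def blockwise_dct_channels(plane, block=8):
    """Map an (H, W) plane to (H/block, W/block, block*block) DCT channels.

    Assumed layout: coefficient (u, v) of every block goes to channel
    u * block + v. This is an illustrative choice, not the paper's.
    """
    h, w = plane.shape
    if h % block or w % block:
        raise ValueError("plane dimensions must be multiples of the block size")
    d = dct_matrix(block)
    # Split into non-overlapping blocks: shape (H/b, W/b, b, b).
    blocks = plane.reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T        # 2-D DCT applied to each block
    return coeffs.reshape(h // block, w // block, block * block)


def low_frequency_indices(keep, block=8):
    """Channel indices of the `keep` lowest-frequency coefficients.

    Coefficients are ordered by u + v (ties broken by u), a simplified
    stand-in for the JPEG zigzag scan.
    """
    order = sorted(
        ((u, v) for u in range(block) for v in range(block)),
        key=lambda uv: (uv[0] + uv[1], uv[0]),
    )
    return [u * block + v for u, v in order[:keep]]


# Usage on a random stand-in for a luminance plane.
plane = np.random.default_rng(0).uniform(0, 255, size=(32, 48))
channels = blockwise_dct_channels(plane)           # shape (4, 6, 64)
subset = channels[..., low_frequency_indices(23)]  # shape (4, 6, 23)
```

In this layout the network sees a tensor that is 8 times smaller in each spatial dimension than the pixel grid, which is one reason an RGB architecture would need tailoring before it can consume DCT inputs.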

