CV · Mar 9, 2025

CLICv2: Image Complexity Representation via Content Invariance Contrastive Learning

arXiv:2503.06641v1 · 3 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of content-invariant complexity representation in computer vision, which is important for applications like image quality assessment, but it appears to be an incremental improvement over the previous CLIC method.

The paper tackles the problem of unsupervised image complexity representation by proposing CLICv2, a contrastive learning framework that enforces content invariance to address bias in positive sample selection and sensitivity to image content. The method significantly outperforms existing unsupervised methods on the IC9600 dataset in terms of PCC and SRCC metrics.
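PCC and SRCC here are the Pearson and Spearman rank correlation coefficients between predicted complexity scores and human-annotated scores. The sketch below shows how the two metrics can be computed with NumPy alone. The function names `pcc`, `srcc`, and `average_ranks` are illustrative and are not taken from the paper or its code.

```python
import numpy as np


def pcc(x, y):
    """Pearson correlation: linear agreement between two score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def average_ranks(a):
    """1-based ranks, with tied values sharing their average rank."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a, kind="mergesort")
    ranks = np.empty(len(a), dtype=float)
    ranks[order] = np.arange(1, len(a) + 1)
    for value in np.unique(a):
        tied = a == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def srcc(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pcc(average_ranks(x), average_ranks(y))


# Hypothetical scores: a monotone but nonlinear relationship.
predicted = [1.0, 2.0, 3.0, 4.0]
annotated = [1.0, 4.0, 9.0, 16.0]
print("PCC :", round(pcc(predicted, annotated), 4))
print("SRCC:", round(srcc(predicted, annotated), 4))
```

SRCC rewards any monotone relationship, while PCC rewards only a linear one, which is why results of this kind are usually reported with both.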

Unsupervised image complexity representation often suffers from bias in positive sample selection and sensitivity to image content. We propose CLICv2, a contrastive learning framework that enforces content invariance for complexity representation. Unlike CLIC, which generates positive samples via cropping and thereby introduces positive-pair bias, our shifted patchify method applies randomized directional shifts to image patches before contrastive learning. Patches at corresponding positions serve as positive pairs, ensuring content-invariant learning. Additionally, we propose a patch-wise contrastive loss, which enhances local complexity representation while mitigating content interference. To further suppress interference from image content, we introduce Masked Image Modeling as an auxiliary task, with the entropy of the masked patches as its modeling objective: the model uses the information in the unmasked patches to recover the entropy of the overall image, and thereby gains global complexity perception. Extensive experiments on IC9600 demonstrate that CLICv2 significantly outperforms existing unsupervised methods in PCC and SRCC, achieving content-invariant complexity representation without introducing positive-pair bias.
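The abstract does not spell out how the shifts are drawn or how patch entropy is measured, so the following is only a rough sketch of one plausible reading. It assumes a grayscale `uint8` image, builds two views by offsetting the patch grid by a random amount, pairs patches by grid index, and uses the Shannon entropy of each patch's intensity histogram as a stand-in for the entropy target. The names `patchify`, `shifted_views`, and `patch_entropy`, and the parameters `patch` and `max_shift`, are hypothetical.

```python
import numpy as np


def patchify(img, patch):
    """Split a 2-D image into non-overlapping patch x patch tiles."""
    h, w = img.shape
    gh, gw = h // patch, w // patch
    img = img[: gh * patch, : gw * patch]
    tiles = img.reshape(gh, patch, gw, patch).swapaxes(1, 2)
    return tiles.reshape(gh * gw, patch, patch)


def shifted_views(img, patch, max_shift, rng):
    """Two patch sets whose grids are offset by random (dy, dx) shifts.

    Assumption: patches with the same index in the two views are
    treated as a positive pair.
    """
    h, w = img.shape
    views = []
    for _ in range(2):
        dy, dx = rng.integers(0, max_shift + 1, size=2)
        crop = img[dy : h - max_shift + dy, dx : w - max_shift + dx]
        views.append(patchify(crop, patch))
    return views


def patch_entropy(patches, levels=256):
    """Shannon entropy (bits) of each patch's intensity histogram."""
    entropy = np.empty(len(patches))
    for i, p in enumerate(patches):
        hist = np.bincount(p.ravel(), minlength=levels).astype(float)
        prob = hist[hist > 0] / hist.sum()
        entropy[i] = -(prob * np.log2(prob)).sum()
    return entropy


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(68, 68), dtype=np.uint8)  # synthetic image
view_a, view_b = shifted_views(image, patch=16, max_shift=4, rng=rng)
print(view_a.shape, view_b.shape)   # same number of patches in both views
print(patch_entropy(view_a)[:3])    # candidate per-patch entropy targets
```

Because both views are cut to the same size, the patch at index `i` in one view overlaps the patch at index `i` in the other up to the random offset, which is what makes position-based pairing possible in this reading.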

