CV, Jul 14, 2022

Deep Image Clustering with Contrastive Learning and Multi-scale Graph Convolutional Networks

arXiv:2207.07173v3, 38 citations, h-index: 64, Has Code
Originality: Incremental advance
AI Analysis

The paper addresses a gap in deep clustering for image data by combining CNNs and GCNs with contrastive learning, offering an incremental improvement over existing approaches.

This paper tackles the problem of deep image clustering by proposing IcicleGCN, a method that integrates contrastive learning and multi-scale graph convolutional networks to unify representation and structure learning, achieving superior clustering performance on multiple image datasets compared to state-of-the-art methods.

Deep clustering has shown promising capability in joint representation learning and clustering via deep neural networks. Despite significant progress, existing deep clustering works mostly rely on a distribution-based clustering loss and lack the ability to unify representation learning and multi-scale structure learning. To address this, this paper presents a new deep clustering approach termed image clustering with contrastive learning and multi-scale graph convolutional networks (IcicleGCN), which bridges the gap between the convolutional neural network (CNN) and the graph convolutional network (GCN), as well as the gap between contrastive learning and multi-scale structure learning, for the deep clustering task. Our framework consists of four main modules: the CNN-based backbone, the Instance Similarity Module (ISM), the Joint Cluster Structure Learning and Instance reconstruction Module (JC-SLIM), and the Multi-scale GCN module (M-GCN). Specifically, the backbone network with two weight-sharing views learns the representations of the two augmented samples generated from each image. The learned representations are then fed to ISM and JC-SLIM for joint instance-level and cluster-level contrastive learning, respectively, during which an auto-encoder in JC-SLIM is also pretrained to serve as a bridge to the M-GCN module. Further, to enforce multi-scale neighborhood structure learning, two streams of GCNs and the auto-encoder are trained simultaneously via (i) layer-wise interaction with representation fusion and (ii) joint self-adaptive learning. Experiments on multiple image datasets demonstrate the superior clustering performance of IcicleGCN over the state of the art. The code is available at https://github.com/xuyuankun631/IcicleGCN.
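The abstract does not give the form of the instance-level contrastive objective, so the following is only a minimal sketch of one common choice, an NT-Xent-style loss over two augmented views. It is not taken from the IcicleGCN code. The function names, the `temperature` value, and the plain-list embeddings are all assumptions made for illustration. The idea it shows is that the two views of the same image form a positive pair, while every other embedding in the batch acts as a negative.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def instance_contrastive_loss(view_a, view_b, temperature=0.5):
    """NT-Xent-style loss (an assumed form, not confirmed by the abstract).

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    Returns the mean loss over all 2N embeddings.
    """
    n = len(view_a)
    z = [l2_normalize(v) for v in list(view_a) + list(view_b)]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Similarities to every other embedding (positive and negatives).
        logits = [dot(z[i], z[j]) / temperature for j in range(2 * n) if j != i]
        pos_logit = dot(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp for the denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


if __name__ == "__main__":
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.1], [0.1, 1.0]]   # each view close to its own anchor
    swapped = [[0.1, 1.0], [1.0, 0.1]]   # each view close to the wrong anchor
    print(instance_contrastive_loss(anchors, aligned))
    print(instance_contrastive_loss(anchors, swapped))
```

In this toy run the aligned pairs should yield the smaller loss, since each embedding is most similar to its own positive. A cluster-level variant would apply the same comparison to columns of the soft cluster-assignment matrix rather than to rows of embeddings, though the abstract does not say how JC-SLIM formulates it.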

Code Implementations: 1 repo
