CV, AI, LG | May 19, 2024

Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

arXiv:2405.11574v1 | h-index: 10 | Has Code
Originality: Synthesis-oriented
AI Analysis

The paper addresses reproducibility for computer vision researchers, but its contribution is incremental because it replicates existing work.

This study reproduces the CDUL method for unsupervised multi-label image classification and attempts to verify its CLIP-based pseudo-label initialization and gradient-alignment training, but does not report new performance numbers.

This report is a reproducibility study of the paper "CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification" (Abdelfattah et al., ICCV 2023). Our report makes the following contributions: (1) We provide a reproducible, well-commented, open-source implementation of the entire method specified in the original paper. (2) We attempt to verify the effectiveness of the novel aggregation strategy, which uses the CLIP model to initialize the pseudo labels for the subsequent unsupervised multi-label image classification task. (3) We attempt to verify the effectiveness of the gradient-alignment training method specified in the original paper, which is used to update the network parameters and pseudo labels. The code can be found at https://github.com/cs-mshah/CDUL
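To make contribution (2) concrete, the following is a minimal sketch of how image-text similarities from a CLIP-like model could be turned into initial soft pseudo labels. It is not taken from the report or its repository: the function names (`class_scores`, `init_pseudo_labels`), the temperature value, the use of image crops, and the aggregation rule (per-class maximum over crops, averaged with the whole-image scores) are all assumptions chosen for illustration, and the toy vectors stand in for real CLIP embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(image_emb, text_embs, temperature):
    """Softmax over classes of the image-to-class-prompt cosine similarities."""
    return softmax([cosine(image_emb, t) for t in text_embs], temperature)


def init_pseudo_labels(global_emb, crop_embs, text_embs, temperature=0.1):
    """Hypothetical aggregation: average the whole-image scores with the
    per-class maximum over crop scores. Assumes at least one crop."""
    global_scores = class_scores(global_emb, text_embs, temperature)
    crop_scores = [class_scores(c, text_embs, temperature) for c in crop_embs]
    # zip(*crop_scores) groups the scores of all crops by class.
    local_scores = [max(per_class) for per_class in zip(*crop_scores)]
    return [(g + l) / 2 for g, l in zip(global_scores, local_scores)]


# Toy stand-ins for embeddings: three class prompts, one image, two crops.
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
global_emb = [0.9, 0.3, 0.1]                      # dominated by class 0
crop_embs = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.0]]    # second crop shows class 1
labels = init_pseudo_labels(global_emb, crop_embs, text_embs)
```

In this toy case the whole-image scores alone would assign almost all mass to class 0, while the crop-level scores raise the pseudo label for class 1. That is the general motivation for combining global and local views in a multi-label setting; the exact rule used by CDUL should be taken from the original paper or the linked repository.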

Code Implementations: 1 repo