CV · AI · Nov 1, 2023

ChatGPT-Powered Hierarchical Comparisons for Image Classification

arXiv:2311.00206v1 · 42 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses CLIP biases that affect distinct but related classes in image classification, offering an incremental improvement.

The paper tackles the zero-shot open-vocabulary challenge in image classification by proposing a framework that uses ChatGPT to recursively group classes into hierarchies and classifies images by comparing embeddings at each level, resulting in an intuitive and explainable method.

Pretrained vision-language models like CLIP tackle the zero-shot open-vocabulary challenge in image classification, and they benefit from class-specific knowledge supplied by large language models (LLMs) like ChatGPT. However, biases in CLIP lead to similar descriptions for distinct but related classes. This motivates our novel image classification framework via hierarchical comparisons: LLMs recursively group classes into hierarchies, and images are classified by comparing image-text embeddings at each hierarchy level. The result is an intuitive, effective, and explainable approach.
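The hierarchical comparison described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Node` class, the `classify` function, the greedy top-down descent, and the hand-written 3-dimensional vectors are all assumptions made for illustration. In a real pipeline the hierarchy and group descriptions would come from an LLM and the embeddings from CLIP's image and text encoders, and the paper's actual scoring rule may differ from a plain greedy argmax.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    """A group of classes (internal node) or a single class (leaf).

    Hypothetical structure: text_embedding stands in for the embedding of
    an LLM-written description of this group or class.
    """
    name: str
    text_embedding: list[float]
    children: list[Node] = field(default_factory=list)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(image_embedding: list[float], root: Node) -> tuple[str, list[str]]:
    """Descend the hierarchy, picking the best-matching child at each level.

    Returns the predicted leaf name and the path of decisions taken,
    which serves as a simple explanation of the prediction.
    """
    path: list[str] = []
    node = root
    while node.children:
        node = max(
            node.children,
            key=lambda child: cosine(image_embedding, child.text_embedding),
        )
        path.append(node.name)
    return node.name, path


# Toy hierarchy with made-up embeddings (assumption: real ones come from CLIP).
hierarchy = Node("root", [0.0, 0.0, 0.0], [
    Node("animals", [1.0, 0.0, 0.0], [
        Node("cat", [0.9, 0.0, 0.4]),
        Node("dog", [0.9, 0.0, -0.4]),
    ]),
    Node("vehicles", [0.0, 1.0, 0.0], [
        Node("car", [0.0, 0.9, 0.4]),
        Node("truck", [0.0, 0.9, -0.4]),
    ]),
])

label, path = classify([0.8, 0.1, 0.5], hierarchy)
print(label, path)  # prints: cat ['animals', 'cat']
```

The returned path records which group won at each level, so the prediction can be read as a chain of coarse-to-fine comparisons rather than a single flat score.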

