In-Context Clustering with Large Language Models
This work extends in-context learning to unsupervised clustering, giving researchers and practitioners a flexible method that also supports text-conditioned clustering. The contribution is incremental in that it adapts existing LLM techniques rather than introducing new ones.
The paper proposes In-Context Clustering (ICC), which uses large language models (LLMs) to cluster data from diverse distributions by capturing complex relationships among inputs through attention. It reports competitive zero-shot performance on text-encoded numeric data and improved performance on numeric and image data after fine-tuning with a next-token-prediction loss.
We propose In-Context Clustering (ICC), a flexible LLM-based procedure for clustering data from diverse distributions. Unlike traditional clustering algorithms constrained by predefined similarity measures, ICC flexibly captures complex relationships among inputs through an attention mechanism. We show that pretrained LLMs exhibit impressive zero-shot clustering capabilities on text-encoded numeric data, with attention matrices showing salient cluster patterns. Spectral clustering using attention matrices offers surprisingly competitive performance. We further enhance the clustering capabilities of LLMs on numeric and image data through fine-tuning using the Next Token Prediction (NTP) loss. Moreover, the flexibility of LLM prompting enables text-conditioned image clustering, a capability that classical clustering methods lack. Our work extends in-context learning to an unsupervised setting, showcasing the effectiveness and flexibility of LLMs for clustering. Our code is available at https://agenticlearning.ai/icc.
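The idea of running spectral clustering on an attention matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `spectral_clusters` and `_kmeans` are hypothetical, the attention matrix is a synthetic row-normalized block matrix standing in for scores extracted from an LLM, and the abstract does not say how attention is selected or aggregated across layers and heads. The sketch assumes only that an (n, n) matrix of pairwise attention scores between the n inputs is available.

```python
import numpy as np


def _kmeans(x, k, n_iter):
    # Deterministic farthest-point initialization, then Lloyd iterations.
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def spectral_clusters(attn, k, n_iter=50):
    """Cluster n items from an (n, n) attention-like score matrix."""
    # Attention rows are normalized and generally asymmetric, so symmetrize
    # the matrix before treating it as an undirected affinity graph.
    w = 0.5 * (attn + attn.T)
    np.fill_diagonal(w, 0.0)
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k smallest.
    _, vecs = np.linalg.eigh(lap)
    emb = vecs[:, :k]
    # Row-normalize the spectral embedding before k-means.
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return _kmeans(emb, k, n_iter)


# Synthetic stand-in for an attention matrix (assumption, not real LLM output):
# high scores within each of three groups, low scores across groups.
rng = np.random.default_rng(0)
true = np.array([0] * 5 + [1] * 5 + [2] * 5)
scores = np.where(true[:, None] == true[None, :], 1.0, 0.05)
scores = scores + 0.01 * rng.random((15, 15))
attn = scores / scores.sum(axis=1, keepdims=True)  # rows sum to 1

labels = spectral_clusters(attn, k=3)
print(labels)
```

Cluster ids are arbitrary, so the recovered labels should be compared to a reference partition only up to a permutation. With real attention scores the block structure would be noisier than in this toy matrix, and the choice of symmetrization and Laplacian normalization here is one common option rather than something the abstract specifies.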