CL · AI · Jun 19, 2024

ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models

arXiv:2406.13342v2 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses a limitation in LLM usability for tasks such as text clustering, though it appears incremental because it builds on existing zero-shot and aggregation techniques.

The paper tackles the problem of LLMs failing on tasks that cannot be fully described in prompts by proposing a method to contextualize tasks using zero-shot inference and aggregated meta-information, resulting in improvements in text clustering tasks on several datasets.

Advances in large language models (LLMs) have brought significant progress in NLP tasks. However, if a task cannot be fully described in a prompt, the model may fail to carry it out. In this paper, we propose a simple yet effective method to contextualize a task for an LLM. The method (1) performs open-ended zero-shot inference over the entire dataset, (2) aggregates the inference results, and (3) incorporates the aggregated meta-information for the actual task. We show its effectiveness on text clustering tasks, empowering LLMs to perform text-to-text-based clustering and leading to improvements on several datasets. Furthermore, we explore the generated class labels for clustering, showing how the LLM understands the task through data.
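The three steps in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prompt wording, the function names (`zero_shot_labels`, `aggregate_labels`, `cluster_with_meta`), the frequency-based aggregation, and the keyword-matching `toy_llm` stand-in are all assumptions made for illustration. In practice `llm` would be any callable that sends a prompt to a real model and returns its text reply.

```python
from collections import Counter
from typing import Callable, List

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text reply out


def zero_shot_labels(texts: List[str], llm: LLM) -> List[str]:
    """Step 1: open-ended zero-shot inference over the whole dataset."""
    prompt = "Describe the class of this text in one or two words.\nText: {}\nClass:"
    return [llm(prompt.format(t)).strip().lower() for t in texts]


def aggregate_labels(labels: List[str], top_k: int) -> List[str]:
    """Step 2: aggregate the free-form labels (assumed here: keep the top_k most frequent)."""
    return [label for label, _ in Counter(labels).most_common(top_k)]


def cluster_with_meta(texts: List[str], candidates: List[str], llm: LLM) -> List[str]:
    """Step 3: redo the task with the aggregated labels given as meta-information."""
    options = ", ".join(candidates)
    prompt = "Choose one class from: " + options + ".\nText: {}\nClass:"
    assignments = []
    for t in texts:
        answer = llm(prompt.format(t)).strip().lower()
        # Replies outside the candidate set are marked rather than guessed.
        assignments.append(answer if answer in candidates else "unassigned")
    return assignments


def toy_llm(prompt: str) -> str:
    """Stand-in for a real model: keyword matching on the text part of the prompt."""
    text = prompt.split("Text: ")[1].split("\nClass:")[0].lower()
    if "goal" in text or "match" in text:
        return "Sports"
    if "stock" in text or "market" in text:
        return "Finance"
    return "Other"


texts = [
    "The striker scored a late goal.",
    "Stock prices fell sharply.",
    "The match ended in a draw.",
    "Markets rallied after the report.",
]
raw_labels = zero_shot_labels(texts, toy_llm)
candidates = aggregate_labels(raw_labels, top_k=2)
assignments = cluster_with_meta(texts, candidates, toy_llm)
```

The point of the sketch is the data flow: the candidate classes are never written by hand but are derived from the model's own open-ended outputs and then fed back as context for the final assignment.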

