CL · Nov 19, 2025

LLM-MemCluster: Empowering Large Language Models with Dynamic Memory for Text Clustering

Yuanjie Zhu, Liangwei Yang, Ke Xu, Weizhi Zhang, Zihe Song, Jindong Wang, Philip S. Yu

arXiv:2511.15424v1 · 2 citations · h-index: 16

Originality: Highly original

AI Analysis

This work provides an end-to-end paradigm for LLM-based text clustering. The contribution is incremental in that it builds on existing LLM capabilities to improve clustering.

The paper tackles the problem of text clustering with large language models (LLMs) by addressing limitations in stateful memory and cluster granularity management, resulting in a tuning-free framework that significantly outperforms baselines on benchmark datasets.

Large Language Models (LLMs) are reshaping unsupervised learning by offering an unprecedented ability to perform text clustering based on their deep semantic understanding. However, their direct application is fundamentally limited by a lack of stateful memory for iterative refinement and the difficulty of managing cluster granularity. As a result, existing methods often rely on complex pipelines with external modules, sacrificing a truly end-to-end approach. We introduce LLM-MemCluster, a novel framework that reconceptualizes clustering as a fully LLM-native task. It leverages a Dynamic Memory to instill state awareness and a Dual-Prompt Strategy to enable the model to reason about and determine the number of clusters. Evaluated on several benchmark datasets, our tuning-free framework significantly and consistently outperforms strong baselines. LLM-MemCluster presents an effective, interpretable, and truly end-to-end paradigm for LLM-based text clustering.
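The abstract does not spell out how the Dynamic Memory works, so the following is only a minimal sketch of one possible reading: the running set of cluster labels is fed back into every prompt so that each call is aware of earlier decisions. It is not the authors' implementation. The names `build_prompt`, `cluster_with_memory`, and `toy_llm` are hypothetical, `toy_llm` is a keyword-matching stand-in for a real model call, and the Dual-Prompt Strategy is not modeled here.

```python
from typing import Callable, Dict, List


def build_prompt(text: str, memory: List[str]) -> str:
    """Render the current cluster labels (the assumed 'memory') into the prompt."""
    known = ", ".join(memory) if memory else "(none yet)"
    return (
        f"Existing clusters: {known}\n"
        f"Text: {text}\n"
        "Reply with an existing cluster label, or propose a new one."
    )


def cluster_with_memory(
    texts: List[str], ask_llm: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Assign texts one by one, carrying the label memory across calls."""
    memory: List[str] = []  # labels seen so far, passed into every prompt
    assignments: Dict[str, List[str]] = {}
    for text in texts:
        label = ask_llm(build_prompt(text, memory)).strip()
        if label not in memory:
            memory.append(label)  # a new cluster becomes visible to later calls
        assignments.setdefault(label, []).append(text)
    return assignments


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call: keyword matching only."""
    text_line = prompt.splitlines()[1].lower()  # the "Text: ..." line
    return "sports" if "match" in text_line or "goal" in text_line else "finance"


if __name__ == "__main__":
    docs = ["The match ended 2-1", "Stocks fell sharply", "A late goal won it"]
    print(cluster_with_memory(docs, toy_llm))
```

In practice `ask_llm` would wrap an actual model API, and the memory could hold more than labels (for example, short cluster descriptions); both choices are assumptions beyond what the abstract states.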
