CL AI Oct 3, 2025

Topic Modeling as Long-Form Generation: Can Long-Context LLMs revolutionize NTM via Zero-Shot Prompting?

Xuan Xu, Haolun Li, Zhongliang Yang, Beilin Chu, Jia Song, Moxuan Xu, Linna Zhou

arXiv:2510.03174v1 2.7 h-index: 4

Originality: Highly original

AI Analysis

This work addresses the potential obsolescence of traditional topic modeling methods for researchers and practitioners in NLP.

The paper investigates whether large language models can outperform neural topic models by framing topic modeling as a long-form generation task, finding that LLMs achieve competitive or superior topic quality in zero-shot settings.

Traditional topic models such as neural topic models (NTMs) rely on inference and generation networks to learn latent topic distributions. This paper explores a new paradigm for topic modeling (TM) in the era of large language models (LLMs), reframing TM as a long-form generation task and updating its definition accordingly. We propose a simple but practical approach that implements LLM-based topic modeling out of the box: sample a data subset, generate topics and representative text with our prompt, and assign texts to topics by keyword matching. We then investigate whether the long-form generation paradigm can beat NTMs via zero-shot prompting. We conduct a systematic comparison between NTMs and LLMs in terms of topic quality and empirically examine the claim that "a majority of NTMs are outdated."
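The three steps named in the abstract (subset sampling, prompted topic generation, keyword-match assignment) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the "label: keyword, keyword" output format, every function name, and the `fake_llm` stand-in are assumptions made for this example, and the generation of representative text is omitted. In practice `fake_llm` would be replaced by a call to a long-context LLM.

```python
import random
import re
from collections import Counter
from typing import Dict, List, Optional


def sample_subset(docs: List[str], k: int, seed: int = 0) -> List[str]:
    """Step 1: draw a random subset small enough to fit in the context window."""
    rng = random.Random(seed)
    return rng.sample(docs, min(k, len(docs)))


def build_prompt(docs: List[str], n_topics: int) -> str:
    """Step 2a: a hypothetical zero-shot prompt (not the paper's prompt)."""
    joined = "\n".join(f"- {d}" for d in docs)
    return (
        f"Read the documents below and list {n_topics} topics, one per line, "
        "as 'label: keyword1, keyword2, ...'.\n" + joined
    )


def parse_topics(text: str) -> Dict[str, List[str]]:
    """Step 2b: parse the assumed 'label: kw, kw' lines into a dict."""
    topics: Dict[str, List[str]] = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        label, words = line.split(":", 1)
        keywords = [w.strip().lower() for w in words.split(",") if w.strip()]
        if keywords:
            topics[label.strip()] = keywords
    return topics


def assign_by_keyword(doc: str, topics: Dict[str, List[str]]) -> Optional[str]:
    """Step 3: pick the topic whose single-word keywords occur most often."""
    tokens = Counter(re.findall(r"[a-z]+", doc.lower()))
    scores = {label: sum(tokens[k] for k in kws) for label, kws in topics.items()}
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned response."""
    return "sports: game, team, score\nfinance: stock, market, bank"


docs = [
    "The team won the game with a late score.",
    "The stock market fell as bank shares dropped.",
    "Rain is expected tomorrow.",
]
topics = parse_topics(fake_llm(build_prompt(sample_subset(docs, 2), n_topics=2)))
assignments = [assign_by_keyword(d, topics) for d in docs]
print(assignments)  # ['sports', 'finance', None]
```

Documents with no keyword hit are left unassigned (`None`) in this sketch; how such documents are handled in the paper is not stated in the abstract.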
