AI · Feb 2

Large Language Model and Formal Concept Analysis: a comparative study for Topic Modeling

Fabrice Boissier, Monica Sen, Irina Rychkova

arXiv:2602.01933v1 · h-index: 12

Originality: Synthesis-oriented

AI Analysis

This work addresses the need to evaluate modern LLMs against traditional methods such as FCA for topic modeling. Its contribution is incremental: it primarily compares existing techniques rather than introducing new algorithms.

The authors compared Large Language Models (LLM) and Formal Concept Analysis (FCA) for topic modeling, finding that both methods effectively extracted topics from documents, with GPT-5 achieving competitive results in a zero-shot setup on datasets including 40 research articles.

Topic modeling is a research field with a growing range of applications, from document retrieval historically to sentiment analysis and text summarization. Large Language Models (LLMs) are currently a major trend in text processing, but few works study their usefulness for this task. Formal Concept Analysis (FCA) has recently been presented as a candidate for topic modeling, but no real applied case study has been conducted. In this work, we compare LLMs and FCA to better understand their strengths and weaknesses in the topic modeling field. FCA is evaluated through the CREA pipeline used in past experiments on topic modeling and visualization, whereas GPT-5 is used for the LLM. A strategy based on three prompts is applied with GPT-5 in a zero-shot setup: topic generation from document batches, merging of batch results into final topics, and topic labeling. A first experiment reuses the teaching materials previously used to evaluate CREA, while a second experiment analyzes 40 research articles in information systems to compare the extracted topics with the underlying subfields.
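
The three-prompt strategy described in the abstract (batch-level topic generation, merging, labeling) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the prompt wording, the `batch_size` default, the one-topic-per-line reply format, and the `llm` callable, which stands in for whatever client sends a prompt to the model and returns its text reply. The abstract does not give the authors' actual prompts or code.

```python
from typing import Callable, Dict, List

# Hypothetical prompt wording: the paper's actual prompts are not in the abstract.
GENERATE_PROMPT = "List the main topics of the following documents, one per line:\n\n{docs}"
MERGE_PROMPT = "Merge these candidate topics into a deduplicated final list, one per line:\n\n{topics}"
LABEL_PROMPT = "Give a short label for this topic:\n\n{topic}"


def batched(items: List[str], size: int) -> List[List[str]]:
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parse_lines(text: str) -> List[str]:
    """Assume the model answers with one topic per line; drop blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def three_prompt_topics(
    documents: List[str],
    llm: Callable[[str], str],  # assumed interface: prompt in, reply text out
    batch_size: int = 10,       # assumed value, not taken from the paper
) -> Dict[str, str]:
    """Return a mapping from each final topic to its label."""
    # Prompt 1: generate candidate topics from each batch of documents.
    candidates: List[str] = []
    for batch in batched(documents, batch_size):
        reply = llm(GENERATE_PROMPT.format(docs="\n---\n".join(batch)))
        candidates.extend(parse_lines(reply))

    # Prompt 2: merge the per-batch results into the final topics.
    merged = parse_lines(llm(MERGE_PROMPT.format(topics="\n".join(candidates))))

    # Prompt 3: ask for a label for each final topic.
    return {topic: llm(LABEL_PROMPT.format(topic=topic)).strip() for topic in merged}
```

Passing the model as a plain callable keeps the sketch independent of any particular client library, and lets the control flow be checked with a stub before any real model is involved.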
