CL · May 5

LLM-XTM: Enhancing Cross-Lingual Topic Models with Large Language Models

arXiv:2605.03299 · 62.5
AI Analysis

For researchers in cross-lingual topic modeling, LLM-XTM offers a scalable and stable method to improve topic quality without requiring bilingual resources or white-box LLM access.

LLM-XTM enhances cross-lingual topic models by integrating LLM-guided topic refinement with self-consistency uncertainty quantification, achieving superior topic coherence and alignment while reducing reliance on bilingual dictionaries and expensive LLM calls.

Cross-lingual topic modeling aims to discover shared semantic structures across languages, yet existing models depend on sparse bilingual resources and often yield incoherent or weakly aligned topics. Recent LLM-based refinements improve interpretability but are costly, document-level, and prone to hallucination, with prior white-box approaches requiring inaccessible token probabilities. We propose LLM-XTM, a framework that integrates LLM-guided topic refinement with self-consistency uncertainty quantification, enabling black-box, stable, and scalable enhancement of cross-lingual topic models. Experiments on multilingual corpora show that LLM-XTM achieves superior topic coherence and alignment while reducing reliance on bilingual dictionaries and expensive LLM calls.
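The abstract does not say how the self-consistency uncertainty is computed, so the following is only a generic sketch of the idea, not the authors' method. It assumes a black-box refiner is queried several times for the same topic, words are kept only when enough samples agree on them, and the mean pairwise Jaccard overlap between samples serves as a confidence score. The names `self_consistency_refine`, `sample_refinement`, `n_samples`, and `min_agreement` are hypothetical, and the refiner here is a deterministic stand-in rather than a real LLM call.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, List, Sequence, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two word sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def self_consistency_refine(
    topic_words: Sequence[str],
    sample_refinement: Callable[[Sequence[str]], List[str]],  # hypothetical black-box refiner
    n_samples: int = 5,
    min_agreement: float = 0.6,
) -> Tuple[List[str], float]:
    """Query the refiner n_samples times; keep words that enough samples agree on.

    Returns the kept words (sorted) and a consistency score in [0, 1].
    Only the refiner's text output is used, so no token probabilities are needed.
    """
    samples = [set(sample_refinement(topic_words)) for _ in range(n_samples)]

    # Fraction of samples in which each word appears.
    counts = Counter(word for sample in samples for word in sample)
    kept = sorted(w for w, c in counts.items() if c / n_samples >= min_agreement)

    # Mean pairwise Jaccard overlap as a simple agreement-based confidence.
    pairs = list(combinations(samples, 2))
    consistency = sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 1.0
    return kept, consistency


# Stand-in for repeated LLM calls (assumption: a real refiner would be stochastic).
_fake_outputs = iter([
    ["economy", "market", "trade"],
    ["economy", "market", "bank"],
    ["economy", "market", "trade"],
])

kept, score = self_consistency_refine(
    ["economy", "market", "trade", "apple"],
    lambda words: next(_fake_outputs),
    n_samples=3,
    min_agreement=0.6,
)
print(kept, round(score, 3))
```

In this toy run, "bank" appears in only one of three samples and is dropped, while a low consistency score could be used to skip or retry a refinement instead of accepting a possibly hallucinated word list.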

