CLIR Oct 16, 2022

Coordinated Topic Modeling

arXiv:2210.08559v2
292 citations
h-index: 48
Originality: Incremental advance
AI Analysis

Coordinated topic modeling gives NLP researchers and practitioners more interpretable corpus representations, though it appears to be an incremental improvement over existing topic modeling approaches.

The paper aims to make topic modeling more interpretable by introducing coordinated topic modeling, which uses predefined topics as semantic axes for representing text corpora. The authors' ECTM model, which combines topic- and document-level supervision with self-training, outperformed the baselines across multiple domains.

We propose a new problem called coordinated topic modeling that imitates how humans describe a text corpus. It treats a set of well-defined topics as the axes of a semantic space with a reference representation. It then uses these axes to model a corpus, yielding an easily understandable representation. This new task helps represent a corpus more interpretably by reusing existing knowledge and benefits the corpus comparison task. We design ECTM, an embedding-based coordinated topic model that effectively uses the reference representation to capture the target corpus-specific aspects while maintaining each topic's global semantics. In ECTM, we introduce topic- and document-level supervision with a self-training mechanism to solve the problem. Finally, extensive experiments on multiple domains show the superiority of our model over other baselines.
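The idea of using predefined topics as axes can be illustrated with a small sketch. This is not ECTM: it omits the topic- and document-level supervision and the self-training the abstract mentions, and simply scores each document against fixed topic vectors with a softmax over cosine similarities. The function names, the `temperature` parameter, and the toy 3-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_coordinates(doc_vec, topic_axes, temperature=0.1):
    """Place one document along each topic axis (softmax over cosine similarities)."""
    sims = [cosine(doc_vec, axis) / temperature for axis in topic_axes.values()]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return {name: e / total for name, e in zip(topic_axes, exps)}


def corpus_coordinates(doc_vecs, topic_axes, temperature=0.1):
    """Describe a corpus as the average of its documents' topic coordinates."""
    totals = {name: 0.0 for name in topic_axes}
    for vec in doc_vecs:
        coords = topic_coordinates(vec, topic_axes, temperature)
        for name, weight in coords.items():
            totals[name] += weight
    return {name: t / len(doc_vecs) for name, t in totals.items()}


# Hypothetical reference topics and document embeddings (toy values).
topic_axes = {
    "sports": [1.0, 0.0, 0.0],
    "politics": [0.0, 1.0, 0.0],
    "science": [0.0, 0.0, 1.0],
}
corpus_a = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]
corpus_b = [[0.1, 0.9, 0.1], [0.0, 0.7, 0.3]]

coords_a = corpus_coordinates(corpus_a, topic_axes)
coords_b = corpus_coordinates(corpus_b, topic_axes)
print(coords_a)
print(coords_b)
```

Because both corpora are expressed over the same named axes, their coordinates can be compared entry by entry, which is the kind of corpus comparison the abstract says the task benefits.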

