CL, LG | Jun 8, 2023

A modified model for topic detection from a corpus and a new metric evaluating the understandability of topics

Tomoya Kitano, Yuto Miyatake, Daisuke Furihata

arXiv:2306.04941v1 | 0.51 citations | h-index: 14

Originality: Synthesis-oriented

AI Analysis

This work addresses topic modeling and its evaluation for researchers and practitioners in natural language processing. It appears incremental: the model is a modification of the existing embedded topic model, and the new metric is positioned as a cheaper alternative to established metrics such as topic coherence.

The paper tackles topic detection from text corpora. It proposes a modified neural model that builds on the embedded topic model and adds document clustering; in numerical experiments, the model performs favorably regardless of document length. It also introduces a new metric for evaluating topic understandability that can be computed more efficiently than widely used metrics such as topic coherence.

This paper presents a modified neural model for topic detection from a corpus and proposes a new metric to evaluate the detected topics. The new model builds upon the embedded topic model, incorporating modifications such as document clustering. Numerical experiments suggest that the new model performs favourably regardless of document length. The new metric, which can be computed more efficiently than widely used metrics such as topic coherence, provides valuable information regarding the understandability of the detected topics.
