CL | Dec 16, 2016

Automatic Labelling of Topics with Neural Embeddings

arXiv:1612.05340v2 | 70 citations
Originality: Incremental advance
AI Analysis

The paper addresses the cognitive overhead that end-users face when interpreting topic models, though the advance is incremental because it builds on existing neural embedding methods.

The paper tackles the problem of interpreting topic model outputs by automatically labelling each topic with a succinct phrase. Labels are chosen from Wikipedia article titles using neural embeddings of documents and words, and the authors report better labels than a state-of-the-art system.

Topics generated by topic models are typically represented as a list of terms. To reduce the cognitive overhead of interpreting these topics for end-users, we propose labelling a topic with a succinct phrase that summarises its theme or idea. Using Wikipedia document titles as label candidates, we compute neural embeddings for documents and words to select the most relevant labels for topics. Compared to a state-of-the-art topic labelling system, our methodology is simpler, more efficient, and finds better topic labels.
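
The label-selection idea in the abstract can be illustrated with a minimal sketch. It assumes that embeddings for topic terms and for candidate Wikipedia titles are already available, and it scores each candidate by its mean cosine similarity to the topic's terms. That scoring rule, the function names `cosine` and `rank_labels`, and the tiny hand-made 3-dimensional vectors are all illustrative assumptions, not the authors' implementation or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_labels(topic_terms, term_vectors, label_vectors, top_n=3):
    """Rank candidate labels by mean cosine similarity to the topic's terms.

    Assumption: a simple average stands in for whatever relevance
    score a real system would use.
    """
    # Skip topic terms that have no embedding.
    vectors = [term_vectors[t] for t in topic_terms if t in term_vectors]
    if not vectors:
        return []
    scored = []
    for label, label_vec in label_vectors.items():
        score = sum(cosine(label_vec, tv) for tv in vectors) / len(vectors)
        scored.append((label, score))
    # Highest-scoring labels first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy embeddings; real ones would be learned from Wikipedia text.
TERM_VECTORS = {
    "stock": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "investor": [0.85, 0.15, 0.05],
}
LABEL_VECTORS = {
    "Stock market": [0.9, 0.15, 0.05],
    "Association football": [0.05, 0.9, 0.1],
    "Photosynthesis": [0.0, 0.1, 0.95],
}

if __name__ == "__main__":
    topic = ["stock", "market", "investor"]
    for label, score in rank_labels(topic, TERM_VECTORS, LABEL_VECTORS):
        print(f"{label}: {score:.3f}")
```

With these toy vectors, the candidate whose embedding points in the same direction as the topic terms ranks first. In practice the candidate set would be far larger, so a real system would likely need an approximate nearest-neighbour search or a pre-filtering step before ranking.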

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.