CL, Dec 16, 2016

Automatic Labelling of Topics with Neural Embeddings

Shraey Bhatia, Jey Han Lau, Timothy Baldwin

arXiv:1612.05340v2 15.270 citations

Originality: Incremental advance

AI Analysis

The paper addresses the cognitive overhead that end-users face when interpreting topic models, though the contribution is incremental because it builds on existing neural embedding methods.

The paper tackles the problem of interpreting topic model outputs by automatically labeling topics with succinct phrases using neural embeddings from Wikipedia titles, achieving better labels than a state-of-the-art system.

Topics generated by topic models are typically represented as a list of terms. To reduce the cognitive overhead of interpreting these topics for end-users, we propose labelling a topic with a succinct phrase that summarises its theme or idea. Using Wikipedia document titles as label candidates, we compute neural embeddings for documents and words to select the most relevant labels for topics. Compared to a state-of-the-art topic labelling system, our methodology is simpler, more efficient, and finds better topic labels.
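The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each candidate label (a Wikipedia title) and each topic term already has an embedding, and it scores a label by its mean cosine similarity to the topic terms. The scoring rule, the function names `cosine` and `rank_labels`, and the three-dimensional toy vectors are all assumptions made for illustration; the abstract does not specify how relevance is computed, so this should not be read as the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_labels(topic_terms, word_vectors, label_vectors, top_n=3):
    """Rank candidate labels by mean cosine similarity to the topic terms.

    Hypothetical scoring rule, chosen for illustration only.
    """
    # Skip topic terms that have no embedding.
    term_vecs = [word_vectors[t] for t in topic_terms if t in word_vectors]
    if not term_vecs:
        return []
    scored = []
    for label, vec in label_vectors.items():
        score = sum(cosine(vec, tv) for tv in term_vecs) / len(term_vecs)
        scored.append((label, score))
    # Highest-scoring labels first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy embeddings (made up); real ones would come from a trained model.
WORD_VECTORS = {
    "stock": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "investor": [0.85, 0.15, 0.05],
}
LABEL_VECTORS = {
    "Stock market": [0.9, 0.1, 0.05],
    "Association football": [0.05, 0.9, 0.1],
    "Cell biology": [0.0, 0.1, 0.95],
}

ranked = rank_labels(["stock", "market", "investor"], WORD_VECTORS, LABEL_VECTORS)
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

With these toy vectors, the label whose embedding lies closest to the topic terms is ranked first. In practice the candidate set would be far larger, and the embeddings would be learned from Wikipedia text rather than written by hand.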
