IR · Feb 6, 2017

Document Visualization using Topic Clouds

arXiv:1702.01520v1
Originality: Synthesis-oriented
AI Analysis

The paper gives NLP practitioners a tool for qualitatively comparing and inspecting document representation algorithms. The contribution is incremental, since it builds on existing visualization and topic modeling methods.

The paper tackles the challenge of visualizing multi-topic document representations by introducing Topic Clouds, a pie chart-based method. Each topic is drawn as a slice sized in proportion to its importance, and the key words of the topic are placed inside the slice to help the reader evaluate the quality of the representation.

Traditionally, a document is visualized by a word cloud. Recently, distributed representation methods for documents have been developed, which map a document to a set of topic embeddings. Visualizing such a representation is useful for presenting the semantics of a document at a finer granularity; it is also challenging, as there are multiple topics, each containing multiple words. We propose to visualize a set of topics using a Topic Cloud, which is a pie chart consisting of topic slices, where each slice contains the important words of that topic. To make important topics and words visually prominent, the sizes of topic slices and word fonts are proportional to their importance in the document. A topic cloud can help the user quickly evaluate the quality of derived document representations. For NLP practitioners, it can be used to qualitatively compare the topic quality of different document representation algorithms, or to inspect how model parameters impact the derived representations.
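The proportional sizing described in the abstract can be illustrated with a small layout computation. The following is a minimal sketch, not the paper's implementation: the function name `topic_cloud_layout`, the `Slice` record, the input format (a topic weight plus per-word weights), and the `max_font` parameter are all assumptions made for illustration. It only computes slice angles and font sizes and does not draw anything.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Slice:
    """One pie slice: its angular extent and the sized words inside it."""
    topic: str
    start_deg: float
    end_deg: float
    words: List[Tuple[str, float]]  # (word, font size in points)


def topic_cloud_layout(
    topics: Dict[str, Tuple[float, Dict[str, float]]],
    max_font: float = 32.0,  # hypothetical parameter: size of the heaviest word
) -> List[Slice]:
    """Map {topic: (topic_weight, {word: word_weight})} to pie slices.

    Assumption: slice angle is proportional to topic weight and font size is
    proportional to word weight, scaled so the heaviest word gets max_font.
    """
    total = sum(weight for weight, _ in topics.values())
    if total <= 0:
        raise ValueError("topic weights must sum to a positive value")

    # Heaviest word weight across all topics; fall back to 1.0 if none is positive.
    top_word = max(
        (w for _, words in topics.values() for w in words.values()),
        default=0.0,
    ) or 1.0

    slices: List[Slice] = []
    angle = 0.0
    # Most important topics first, so they start at angle 0.
    for name, (weight, words) in sorted(topics.items(), key=lambda kv: -kv[1][0]):
        sweep = 360.0 * weight / total
        sized = [
            (word, max_font * w / top_word)
            for word, w in sorted(words.items(), key=lambda kv: -kv[1])
        ]
        slices.append(Slice(name, angle, angle + sweep, sized))
        angle += sweep
    return slices


# Made-up example document with three topics (weights are illustrative only).
doc = {
    "sports": (5, {"game": 4, "team": 2}),
    "finance": (3, {"market": 3}),
    "weather": (2, {"rain": 1}),
}
layout = topic_cloud_layout(doc)
for s in layout:
    print(f"{s.topic}: {s.start_deg:.0f} to {s.end_deg:.0f} degrees, words={s.words}")
```

The resulting angles and font sizes could then be handed to any plotting library, for example as wedge patches and text labels in matplotlib. Strict proportionality can make low-weight words very small, so a real tool would likely clamp fonts to a minimum readable size.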

