On Semantic Word Cloud Representation
This work addresses the problem of generating word clouds that are both semantically meaningful and visually coherent for data visualization applications, improving on prior heuristic methods.
The authors formalize the generation of semantic-preserving word clouds as the Word Rectangle Adjacency Contact (WRAC) problem, in which semantically related words are placed in touching rectangles. They develop efficient polynomial-time algorithms for some variants, prove NP-hardness for several general variants, describe approximation algorithms, and show experimentally that their algorithms outperform existing heuristics.
We study the problem of computing semantic-preserving word clouds in which semantically related words are close to each other. While several heuristic approaches have been described in the literature, we formalize the underlying geometric problem: Word Rectangle Adjacency Contact (WRAC). In this model each word is associated with a rectangle of fixed dimensions, and the goal is to represent semantically related words by ensuring that the two corresponding rectangles touch. We design and analyze efficient polynomial-time algorithms for some variants of the WRAC problem, show that several general variants are NP-hard, and describe a number of approximation algorithms. Finally, we experimentally demonstrate that our theoretically sound algorithms outperform the earlier heuristics.
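To make the contact condition concrete, the following is a minimal sketch of how one might check which related word pairs a given layout realizes. It is not the authors' algorithm: it only evaluates a layout that is already given. The names Rect, touches, and realized_pairs are hypothetical, and the sketch assumes that "touch" means the two rectangles share a boundary segment of positive length while their interiors stay disjoint (corner-only contact does not count). Integer coordinates are used so that the equality tests are exact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned word box: lower-left corner (x, y), fixed width and height."""
    x: int
    y: int
    w: int
    h: int


def overlap_len(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap of intervals [a_lo, a_hi] and [b_lo, b_hi].

    Zero means the intervals meet at a single point; negative means a gap.
    """
    return min(a_hi, b_hi) - max(a_lo, b_lo)


def touches(a, b):
    """True if a and b share a boundary segment of positive length.

    Assumed contact definition: the boxes abut along one axis (overlap 0)
    and overlap by a positive length along the other axis.
    """
    dx = overlap_len(a.x, a.x + a.w, b.x, b.x + b.w)
    dy = overlap_len(a.y, a.y + a.h, b.y, b.y + b.h)
    return (dx == 0 and dy > 0) or (dy == 0 and dx > 0)


def realized_pairs(layout, related):
    """Return the related word pairs whose rectangles touch in the layout."""
    return [(u, v) for u, v in related if touches(layout[u], layout[v])]


# Hypothetical layout and similarity pairs, for illustration only.
layout = {
    "graph": Rect(0, 0, 4, 2),
    "drawing": Rect(4, 0, 5, 2),   # abuts "graph" on its right side
    "cloud": Rect(0, 2, 3, 1),     # sits on top of "graph"
    "word": Rect(10, 10, 2, 1),    # far away from everything else
}
related = [
    ("graph", "drawing"),
    ("graph", "cloud"),
    ("drawing", "cloud"),
    ("word", "cloud"),
]

print(realized_pairs(layout, related))
```

In this example two of the four related pairs are realized: "graph" touches both "drawing" and "cloud", whereas "drawing" and "cloud" are separated by a horizontal gap and "word" is isolated. Counting realized pairs in this way is one natural quality measure for a layout under the stated assumptions.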