Low-dimensional Embeddings for Interpretable Anchor-based Topic Inference
This work addresses reduced topic quality and interpretability in anchor-based topic inference, a problem that affects natural language processing users. The contribution is incremental, since it builds on the existing anchor words algorithm.
The paper tackles poor anchor word selection in topic model inference by proposing low-dimensional embeddings in which an exact convex hull can be found in 2- or 3-dimensional space, which improves topic quality and interpretability.
The anchor words algorithm performs provably efficient topic model inference by finding an approximate convex hull in a high-dimensional word co-occurrence space. However, the existing greedy algorithm often selects poor anchor words, reducing topic quality and interpretability. Rather than finding an approximate convex hull in a high-dimensional space, we propose to find an exact convex hull in a visualizable 2- or 3-dimensional space. Such low-dimensional embeddings both improve topics and clearly show users why the algorithm selects certain words.
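The exact-hull step can be illustrated with a minimal sketch, assuming each word already has 2-dimensional coordinates. The text above does not say how the embedding is computed, so the coordinates, the word list, and the function name `convex_hull_indices` are hypothetical. The hull routine is the standard Andrew monotone chain algorithm, used here as a stand-in and not necessarily the procedure the authors use.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_indices(points: List[Point]) -> List[int]:
    """Return indices of the exact convex hull vertices, counter-clockwise.

    Andrew's monotone chain: sort the points, then build the lower and
    upper chains, dropping any point that does not make a left turn.
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    if len(order) <= 2:
        return order

    def half(seq):
        chain: List[int] = []
        for i in seq:
            while len(chain) >= 2 and cross(
                points[chain[-2]], points[chain[-1]], points[i]
            ) <= 0:
                chain.pop()
            chain.append(i)
        return chain

    lower = half(order)
    upper = half(reversed(order))
    # The last index of each chain is the first index of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical 2-D coordinates, made up for illustration only.
embedding = {
    "game": (0.0, 0.0),
    "court": (4.0, 0.0),
    "music": (4.0, 3.0),
    "market": (0.0, 3.0),
    "said": (2.0, 1.5),  # interior point: never a hull vertex
    "new": (1.0, 1.0),   # interior point: never a hull vertex
}
words = list(embedding)
points = [embedding[w] for w in words]

# Candidate anchor words are the words sitting on the hull's vertices.
anchors = [words[i] for i in convex_hull_indices(points)]
print(anchors)
```

Because the hull is computed in two dimensions, the points and hull vertices can be drawn on a scatter plot, so a user can see why a given word was selected: it lies at a corner of the point cloud, while common words such as "said" fall in the interior.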