A Query-Driven Topic Model
This work addresses the need for more accessible, user-friendly topic modeling in the social sciences and digital humanities. The contribution is incremental in that it builds on existing topic modeling methods.
The paper tackles the problem of letting users find topics related to a specific aspect of a corpus without requiring input from domain experts. It proposes a query-driven topic model that returns query-related topics from simple word or phrase queries. Experimental results show its effectiveness compared with classical and neural topic models.
Topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus. It has been increasingly adopted as a tool in the social sciences, including political science, digital humanities and sociological research in general. One desirable property of topic models is that they allow users to find topics describing a specific aspect of the corpus. A possible solution is to incorporate domain-specific knowledge into topic modeling, but this requires a specification from domain experts. We propose a novel query-driven topic model that allows users to specify a simple query in words or phrases and returns query-related topics, thus avoiding tedious work by domain experts. Our proposed approach is particularly attractive when the user-specified query occurs rarely in a text corpus, which makes it difficult for traditional topic models built on word co-occurrence patterns to identify relevant topics. Experimental results demonstrate the effectiveness of our model in comparison with both classical topic models and neural topic models.
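To make the low-occurrence problem concrete, the following is a minimal sketch of document-level co-occurrence counting. It is not the proposed model. The toy corpus and the helper names `cooccurrence_counts` and `neighbours` are hypothetical and serve only to show how little co-occurrence evidence a rare query word contributes.

```python
from collections import Counter
from itertools import combinations

# Toy corpus (hypothetical, for illustration only).
docs = [
    "election vote party candidate",
    "vote party campaign election",
    "party candidate campaign vote",
    "election campaign subsidy farm",
]

def cooccurrence_counts(documents):
    """Count, for each unordered word pair, the documents containing both words."""
    counts = Counter()
    for doc in documents:
        words = sorted(set(doc.split()))  # unique words, sorted so pairs are canonical
        counts.update(combinations(words, 2))
    return counts

def neighbours(counts, word):
    """Return {other_word: count} for every counted pair that contains `word`."""
    result = {}
    for (a, b), n in counts.items():
        if a == word:
            result[b] = n
        elif b == word:
            result[a] = n
    return result

counts = cooccurrence_counts(docs)
frequent = neighbours(counts, "vote")    # word that appears in three documents
rare = neighbours(counts, "subsidy")     # word that appears in one document

print("vote:", frequent)
print("subsidy:", rare)
```

In this toy setting the frequent word has repeated co-occurrences with several other words, while every co-occurrence of the rare word is observed exactly once, so a model that relies only on these counts has little signal from which to build a topic around it.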