STTM: A Tool for Short Text Topic Modeling
STTM addresses the need for semantic understanding of social communications by giving researchers and practitioners a shared resource for short text topic modeling, though the contribution is incremental because it packages existing methods.
The authors tackle topic discovery from short texts by introducing STTM, the first comprehensive open-source Java package that integrates state-of-the-art models, benchmark datasets, and functions for inference and evaluation, making it easier to develop new methods and to compare them against existing ones.
With the emergence and popularity of social communication on the Internet, topic discovery from short texts has become fundamental to many applications that require semantic understanding of textual content. As an emerging research field, short text topic modeling offers an algorithmic methodology that complements regular text topic modeling and specifically targets the limited word co-occurrence information in short texts. This paper presents STTM, the first comprehensive open-source Java package that integrates state-of-the-art short text topic modeling algorithms, benchmark datasets, and a wide range of functions for model inference and evaluation. The package is designed to make it easier to develop new methods in this research field and to compare new approaches against existing ones. STTM is open-sourced at https://github.com/qiang2100/STTM.