Hierarchical Graph Topic Modeling with Topic Tree-based Transformer
This paper addresses the challenge of integrating textual semantics with graph connectivity, a problem relevant to researchers in natural language processing and graph learning. The contribution appears incremental, however, because it combines existing ideas such as hyperbolic embeddings and hierarchical topic modeling.
The paper tackles the problem of modeling both hierarchical topic structure within documents and hierarchical graph structure across interlinked documents. It proposes a unified Transformer-based model and reports that the model is effective in both supervised and unsupervised experiments.
Textual documents are commonly connected in a hierarchical graph structure in which a central document links to others with exponentially growing connectivity. Though Hyperbolic Graph Neural Networks (HGNNs) excel at capturing such graph hierarchy, they cannot model the rich textual semantics within documents. Moreover, the text of a document usually discusses topics of different specificity. Hierarchical Topic Models (HTMs) discover such latent topic hierarchy within text corpora. However, most of them focus on the textual content within documents and ignore the graph adjacency across interlinked documents. We thus propose a Hierarchical Graph Topic Modeling Transformer that integrates both the topic hierarchy within documents and the graph hierarchy across documents into a unified Transformer. Specifically, to incorporate the topic hierarchy within documents, we design a topic tree and infer a hierarchical tree embedding for hierarchical topic modeling. To preserve both topic and graph hierarchies, we design our model in hyperbolic space and propose a Hyperbolic Doubly Recurrent Neural Network, which models the ancestral and fraternal structure of the tree. Both hierarchies are inserted into each Transformer layer to learn unified representations. Both supervised and unsupervised experiments verify the effectiveness of our model.
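To make two of the ingredients named above more concrete, the following is a minimal sketch, not the paper's implementation. It shows (a) the standard geodesic distance in the Poincaré ball, one common model of hyperbolic space, and (b) a simplified, Euclidean doubly recurrent pass over a topic tree, in which each node's state combines an ancestral signal from its parent with a fraternal signal from its previous sibling. The names `TopicNode`, `doubly_recurrent_pass`, `w_anc`, and `w_fra` are hypothetical, the scalar weights stand in for learned matrices, and the hyperbolic variant of the recurrence described in the abstract is not reproduced here.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


class TopicNode:
    """Hypothetical topic-tree node: a feature vector plus ordered children."""

    def __init__(self, name, feature, children=None):
        self.name = name
        self.feature = feature
        self.children = children or []
        self.state = None  # filled in by doubly_recurrent_pass


def _scaled_tanh(weight, vectors):
    # Elementwise tanh(weight * sum of the vectors); an assumption that
    # replaces the learned weight matrices of a real recurrent cell.
    return [math.tanh(weight * sum(vals)) for vals in zip(*vectors)]


def doubly_recurrent_pass(root, dim, w_anc=0.5, w_fra=0.5):
    """Top-down pass: each state mixes an ancestral and a fraternal state."""
    zero = [0.0] * dim

    def visit(node, parent_state, prev_sibling_state):
        h_anc = _scaled_tanh(w_anc, [parent_state, node.feature])        # ancestral
        h_fra = _scaled_tanh(w_fra, [prev_sibling_state, node.feature])  # fraternal
        node.state = [math.tanh(a + f) for a, f in zip(h_anc, h_fra)]
        prev = zero  # the first child has no previous sibling
        for child in node.children:
            visit(child, node.state, prev)
            prev = child.state

    visit(root, zero, zero)
    return root


# Usage: a two-level topic tree with illustrative (made-up) features.
tree = TopicNode("root", [0.1, 0.2], [
    TopicNode("sports", [0.3, -0.1]),
    TopicNode("politics", [-0.2, 0.4]),
])
doubly_recurrent_pass(tree, dim=2)
```

Because the fraternal state is threaded through the children in order, two siblings with identical features still receive different states. In the Poincaré ball, distances grow rapidly toward the boundary, which is the property that makes hyperbolic space a natural fit for tree-like structure.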