Variational Gaussian Topic Model with Invertible Neural Projections
This work addresses a specific bottleneck in topic modeling for text analysis and offers an incremental improvement over existing neural methods.
The authors address the problem that neural topic models rarely incorporate the word relatedness captured in word embeddings. They propose VaGTM, which models each topic as a multivariate Gaussian, and VaGTM-IP, which adds invertible neural projections; both outperform baselines and yield more coherent topics on three benchmark corpora.
Neural topic models have triggered a surge of interest in automatically extracting topics from text, since they avoid the sophisticated derivations required by conventional topic models. However, few neural topic models incorporate the word relatedness information captured in word embeddings into the modeling process. To address this issue, we propose a novel topic modeling approach called the Variational Gaussian Topic Model (VaGTM). Based on the variational auto-encoder, VaGTM models each topic with a multivariate Gaussian in the decoder to incorporate word relatedness. Furthermore, to address the limitation that pre-trained word embeddings of topic-associated words do not follow a multivariate Gaussian, we extend VaGTM to the Variational Gaussian Topic Model with Invertible neural Projections (VaGTM-IP). We use three benchmark text corpora to verify the effectiveness of VaGTM and VaGTM-IP. The experimental results show that both models outperform several competitive baselines and obtain more coherent topics.
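
To make the idea of modeling a topic as a multivariate Gaussian over word embeddings more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a diagonal covariance, random stand-in embeddings, and a topic-word distribution obtained by normalizing Gaussian densities over the vocabulary; the abstract specifies none of these details. The names `log_gaussian_diag`, `topic_word_distribution`, `theta`, and `beta` are hypothetical.

```python
import math
import random


def log_gaussian_diag(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance (an assumption)."""
    return -0.5 * sum(
        (xi - mi) ** 2 / vi + math.log(vi) + math.log(2.0 * math.pi)
        for xi, mi, vi in zip(x, mean, var)
    )


def topic_word_distribution(embeddings, mean, var):
    """Normalise one topic's Gaussian densities over the whole vocabulary."""
    logs = [log_gaussian_diag(e, mean, var) for e in embeddings]
    top = max(logs)  # subtract the max before exponentiating, for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


random.seed(0)
V, d, K = 20, 4, 3  # vocabulary size, embedding dimension, number of topics

# Random stand-ins for pre-trained word embeddings and topic parameters.
embeddings = [[random.gauss(0, 1) for _ in range(d)] for _ in range(V)]
means = [[random.gauss(0, 1) for _ in range(d)] for _ in range(K)]
variances = [[1.0] * d for _ in range(K)]

# beta[k][v]: probability of word v under topic k.
beta = [topic_word_distribution(embeddings, means[k], variances[k]) for k in range(K)]

# Hypothetical document-topic proportions, as an encoder might produce.
theta = [0.7, 0.2, 0.1]

# Mixture over topics gives the document's distribution over words.
word_probs = [sum(theta[k] * beta[k][v] for k in range(K)) for v in range(V)]
```

Under this sketch, words whose embeddings lie close to a topic's mean receive higher probability under that topic, which is one way that relatedness between word embeddings can enter the decoder.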