Discovering Discrete Latent Topics with Neural Variational Inference
This work addresses scalable inference in topic models, replacing the closed-form update derivations of traditional methods with neural variational inference trained by backpropagation.
The paper tackles the difficulty of fast and accurate inference in expressive topic models by introducing parameterisable distributions over topics that can be trained by backpropagation. It also proposes a recurrent network, built on a stick-breaking construction, that can discover a notionally unbounded number of topics, and it reports results on the MXM Song Lyrics, 20NewsGroups and Reuters News datasets.
Topic models have been widely explored as probabilistic generative models of documents. Traditional inference methods have sought closed-form derivations for updating the models; however, as the expressiveness of these models grows, so does the difficulty of performing fast and accurate inference over their parameters. This paper presents alternative neural approaches to topic modelling by providing parameterisable distributions over topics which permit training by backpropagation in the framework of neural variational inference. In addition, with the help of a stick-breaking construction, we propose a recurrent network that is able to discover a notionally unbounded number of topics, analogous to Bayesian non-parametric topic models. Experimental results on the MXM Song Lyrics, 20NewsGroups and Reuters News datasets demonstrate the effectiveness and efficiency of these neural topic models.
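The stick-breaking construction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It only shows the standard way a sequence of break fractions in (0, 1) is turned into topic proportions that sum to one. The names `sigmoid`, `stick_breaking`, `fractions` and `theta` are hypothetical, and the random Gaussian draws stand in for whatever a network would output. A recurrent generator of break fractions is not modelled here.

```python
import math
import random


def sigmoid(x):
    """Squash a real number into (0, 1) so it can act as a break fraction."""
    return 1.0 / (1.0 + math.exp(-x))


def stick_breaking(fractions):
    """Turn K-1 break fractions in (0, 1) into K proportions that sum to 1.

    Each fraction takes that share of whatever stick length is still left;
    the final proportion is the leftover piece.
    """
    remaining = 1.0
    proportions = []
    for eta in fractions:
        proportions.append(eta * remaining)  # piece broken off at this step
        remaining *= 1.0 - eta               # stick length left for later topics
    proportions.append(remaining)            # last topic gets the remainder
    return proportions


# Assumption: Gaussian noise passed through a sigmoid stands in for the
# break fractions that a trained model would produce.
random.seed(0)
fractions = [sigmoid(random.gauss(0.0, 1.0)) for _ in range(4)]
theta = stick_breaking(fractions)
```

Because each step only needs the stick length left over from the previous steps, further breaks can be appended without recomputing earlier proportions, which is what makes the construction a natural fit for a notionally unbounded number of topics.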