A Joint Learning Approach for Semi-supervised Neural Topic Modeling
This work addresses the need for more interpretable text representations in semi-supervised settings, though the contribution is incremental because it builds on existing neural topic models.
The authors tackle the problem of improving topic modeling by introducing a semi-supervised neural topic model, LI-NTM, which outperforms existing neural topic models on document reconstruction benchmarks, especially when labeled data is scarce and the labels are informative.
Topic models are some of the most popular ways to represent textual data in an interpretable manner. Recently, advances in deep generative models, specifically auto-encoding variational Bayes (AEVB), have led to the introduction of unsupervised neural topic models, which leverage deep generative models as opposed to traditional statistics-based topic models. We extend these neural topic models by introducing the Label-Indexed Neural Topic Model (LI-NTM), which is, to the best of our knowledge, the first effective upstream semi-supervised neural topic model. We find that LI-NTM outperforms existing neural topic models in document reconstruction benchmarks, with the most notable results in low labeled data regimes and for datasets with informative labels; furthermore, our jointly learned classifier outperforms baseline classifiers in ablation studies.