ML IR LG | Sep 3, 2019

Discriminative Topic Modeling with Logistic LDA

Iryna Korshunova, Hanchen Xiong, Mateusz Fedoryszak, Lucas Theis

arXiv:1909.01436v24.119 citations | Has Code

Originality: Incremental advance

AI Analysis

This work targets researchers and practitioners in fields such as computer vision and NLP who work with rich, non-categorical data that is difficult to handle with LDA. It offers an incremental improvement over existing topic models.

The paper tackles the challenge of applying latent Dirichlet allocation (LDA) to non-categorical data such as images and text embeddings. It proposes logistic LDA, a discriminative variant that retains the interpretability of LDA and can learn from unlabeled data by exploiting the group structure present in the data.

Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the vast literature on LDA. We propose logistic LDA, a novel discriminative variant of latent Dirichlet allocation which is easy to apply to arbitrary inputs. In particular, our model can easily be applied to groups of images and arbitrary text embeddings, and it integrates well with deep neural networks. Although it is a discriminative model, we show that logistic LDA can learn from unlabeled data in an unsupervised manner by exploiting the group structure present in the data. In contrast to other recent topic models designed to handle arbitrary inputs, our model does not sacrifice the interpretability and principled motivation of LDA.
