CV LG ML | May 23, 2013

A Supervised Neural Autoregressive Topic Model for Simultaneous Image Classification and Annotation

Yin Zheng, Yu-Jin Zhang, Hugo Larochelle

arXiv:1305.5306v1 | 1 citation

Originality: Incremental advance

AI Analysis

This work addresses scene recognition and annotation in computer vision. The advance is incremental: it extends an existing model, DocNADE, to a new domain and adds supervision.

The paper tackles visual scene modeling by proposing SupDocNADE, a supervised neural autoregressive topic model that incorporates label information and the spatial position of visual words to perform image classification and annotation simultaneously. It compares favorably with other topic models on the Scene15, LabelMe, and UIUC-Sports datasets.

Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice to perform scene recognition and annotation. Recently, a new type of topic model called the Document Neural Autoregressive Distribution Estimator (DocNADE) was proposed and demonstrated state-of-the-art performance for document modeling. In this work, we show how to successfully apply and extend this model to the context of visual scene modeling. Specifically, we propose SupDocNADE, a supervised extension of DocNADE, that increases the discriminative power of the hidden topic features by incorporating label information into the training objective of the model. We also describe how to leverage information about the spatial position of the visual words and how to embed additional image annotations, so as to simultaneously perform image classification and annotation. We test our model on the Scene15, LabelMe and UIUC-Sports datasets and show that it compares favorably to other topic models such as the supervised variant of LDA.
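The idea of folding label information into the training objective can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the standard DocNADE form, in which the hidden state for each word depends only on the words before it, and it assumes a hybrid loss that adds a classification term to the generative term weighted by a hypothetical coefficient `lam`. The sigmoid activation, the flat softmax over the visual vocabulary, and all parameter names (`W`, `c`, `V`, `b`, `U`, `d`) are illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def log_softmax(z):
    # Numerically stable log-softmax over a 1-D vector.
    z = z - np.max(z)
    return z - np.log(np.sum(np.exp(z)))


def hybrid_loss(words, label, W, c, V, b, U, d, lam=1.0):
    """Sketch of a supervised autoregressive topic-model loss (assumed form).

    words : sequence of visual-word indices for one image
    label : class index of the image
    W     : (n_hidden, vocab) word embedding matrix
    c     : (n_hidden,) hidden bias
    V, b  : (vocab, n_hidden), (vocab,) output weights for the next word
    U, d  : (n_classes, n_hidden), (n_classes,) classifier weights
    lam   : hypothetical weight on the generative term
    """
    acc = np.zeros(W.shape[0])  # running sum of embeddings of preceding words
    gen_nll = 0.0
    for v in words:
        h = sigmoid(c + acc)  # hidden state built from earlier words only
        gen_nll -= log_softmax(b + V @ h)[v]  # -log p(v_i | v_<i)
        acc += W[:, v]
    h_doc = sigmoid(c + acc)  # topic features computed from all words
    dis_nll = -log_softmax(d + U @ h_doc)[label]  # -log p(y | v)
    return dis_nll + lam * gen_nll, dis_nll, gen_nll
```

With `lam = 0` the sketch reduces to a purely discriminative classifier on the topic features. Larger values of `lam` put more weight on modeling the visual words themselves.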
