LG · IR · ML · Oct 16, 2012

Factorized Multi-Modal Topic Model

arXiv:1210.4920v1 · 43 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for better analysis of multi-modal data collections, such as image-text pairs, by providing a method that avoids forcing dependencies between minimally correlating modalities. Its contribution is incremental, since it combines two existing approaches.

The authors tackled the problem of analyzing multi-modal data, specifically paired images and text, by developing a novel topic model that learns both shared and private topics, enabling more accurate cross-modal querying.

Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations the current dominant approach is based on extensions of canonical correlation analysis, factorizing the variation into components shared by the different modalities and those private to each of them. For count data, multiple variants of topic models attempting to tie the modalities together have been presented. All of these, however, lack the ability to learn components private to one modality, and consequently will try to force dependencies even between minimally correlating modalities. In this work we combine the two approaches by presenting a novel HDP-based topic model that automatically learns both shared and private topics. The model is shown to be especially useful for querying the contents of one domain given samples of the other.
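The shared/private factorization described in the abstract can be illustrated with a toy generative sketch. This is not the authors' model: it assumes a fixed, finite number of topics rather than an HDP, uses plain Dirichlet priors, and performs no inference. All names (`build_topics`, `generate_pair`, the `alpha` and `beta` values) are hypothetical and chosen only for illustration. The one idea it shows is that a shared topic has a word distribution in both modalities, while a private topic has one in a single modality only.

```python
import random


def dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def build_topics(rng, n_shared, n_text_only, n_image_only,
                 text_vocab, image_vocab, beta=0.5):
    """Create topics; a modality's entry is None if the topic is inactive there."""
    topics = []
    for _ in range(n_shared):  # active in both modalities
        topics.append({"text": dirichlet([beta] * text_vocab, rng),
                       "image": dirichlet([beta] * image_vocab, rng)})
    for _ in range(n_text_only):  # private to text
        topics.append({"text": dirichlet([beta] * text_vocab, rng),
                       "image": None})
    for _ in range(n_image_only):  # private to images
        topics.append({"text": None,
                       "image": dirichlet([beta] * image_vocab, rng)})
    return topics


def generate_view(rng, topics, theta, modality, n_tokens):
    """Sample (topic, word) tokens using only topics active in this modality."""
    active = [k for k, t in enumerate(topics) if t[modality] is not None]
    weights = [theta[k] for k in active]  # rng.choices normalises weights
    tokens = []
    for _ in range(n_tokens):
        k = rng.choices(active, weights=weights)[0]
        word_dist = topics[k][modality]
        word = rng.choices(range(len(word_dist)), weights=word_dist)[0]
        tokens.append((k, word))
    return tokens


def generate_pair(rng, topics, n_tokens=20, alpha=0.5):
    """Generate one paired document: a single topic mixture drives both views."""
    theta = dirichlet([alpha] * len(topics), rng)
    return {m: generate_view(rng, topics, theta, m, n_tokens)
            for m in ("text", "image")}


rng = random.Random(0)
topics = build_topics(rng, n_shared=2, n_text_only=1, n_image_only=1,
                      text_vocab=8, image_vocab=6)
pair = generate_pair(rng, topics, n_tokens=30)
```

In this sketch the two views of a pair are coupled only through the shared topics' weights in `theta`; the private topics absorb variation that appears in one modality alone, which is the behaviour the abstract says earlier multi-modal topic models lack.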

