CV · IR · LG · Nov 4, 2013

TOP-SPIN: TOPic discovery via Sparse Principal component INterference

arXiv:1311.1406v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses topic discovery in computer vision for unlabeled image datasets, but it appears incremental as it builds on existing sparse PCA and bag-of-words methods.

The authors address topic discovery in unlabeled images by applying sparse principal component analysis (PCA) to bag-of-words histograms, yielding an efficient algorithm that shows encouraging experimental results on topic discovery and category prediction.

We propose a novel topic discovery algorithm for unlabeled images based on the bag-of-words (BoW) framework. We first extract a dictionary of visual words and then compute a visual word occurrence histogram for each image. We view these histograms as rows of a large matrix from which we extract sparse principal components (PCs). Each PC identifies a sparse combination of visual words which co-occur frequently in some images but seldom appear in others. Each sparse PC corresponds to a topic, and images whose interference with the PC is high belong to that topic, revealing the common parts possessed by the images. We propose to solve the associated sparse PCA problems using an Alternating Maximization (AM) method, which we modify to extract multiple PCs efficiently in a deflation scheme. Our approach attacks the maximization problem in sparse PCA directly and is scalable to high-dimensional data. Experiments on automatic topic discovery and category prediction demonstrate encouraging performance of our approach.
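
The pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact AM updates or the deflation rule, so the sketch substitutes a generic alternating scheme with hard thresholding to k nonzero loadings and a projection-style deflation. It also takes "interference" to mean the magnitude of an image histogram's projection onto a PC. The function names (`bow_histogram`, `sparse_pc`, `discover_topics`, `assign_topics`) and the parameters `k`, `n_iter`, and `seed` are hypothetical.

```python
import numpy as np


def bow_histogram(descriptors, dictionary):
    """Count how often each visual word is the nearest dictionary entry."""
    diffs = descriptors[:, None, :] - dictionary[None, :, :]
    nearest = (diffs ** 2).sum(axis=2).argmin(axis=1)
    return np.bincount(nearest, minlength=len(dictionary))


def sparse_pc(A, k, n_iter=200, seed=0):
    """One k-sparse PC of A (rows = images, columns = visual words).

    Alternates between image scores x and word loadings y to increase
    x^T A y with ||x|| = ||y|| = 1 and at most k nonzeros in y.
    Assumption: hard thresholding stands in for the paper's AM update.
    """
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(A.shape[1])
    y /= np.linalg.norm(y)
    for _ in range(n_iter):
        x = A @ y
        if not np.any(x):               # A is empty: nothing to extract
            return np.zeros(A.shape[1])
        x /= np.linalg.norm(x)          # best unit x for the current y
        z = A.T @ x
        keep = np.argsort(np.abs(z))[-k:]   # k largest-magnitude loadings
        y_new = np.zeros(A.shape[1])
        y_new[keep] = z[keep]
        y_new /= np.linalg.norm(y_new)  # best k-sparse unit y for this x
        converged = np.allclose(y_new, y)
        y = y_new
        if converged:
            break
    return y


def discover_topics(H, n_topics, k):
    """Extract n_topics sparse PCs from the histogram matrix H by deflation."""
    A = np.asarray(H, dtype=float).copy()
    pcs = []
    for t in range(n_topics):
        y = sparse_pc(A, k, seed=t)
        pcs.append(y)
        # Assumption: projection-style deflation removes what y explains.
        A -= np.outer(A @ y, y)
    return np.vstack(pcs)


def assign_topics(H, pcs):
    """Give each image the topic whose PC it projects onto most strongly."""
    scores = np.abs(np.asarray(H, dtype=float) @ pcs.T)
    return scores.argmax(axis=1)
```

Here `H` would hold one `bow_histogram` result per image, stacked as rows. Calling `discover_topics(H, n_topics, k)` returns one sparse loading vector per topic, and `assign_topics(H, pcs)` maps each image to a topic index. The sparsity level `k` controls how many visual words define a topic. The sketch leaves `H` uncentered, which is a further simplification.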

