LG | CV | ML | Oct 16, 2012

Nested Dictionary Learning for Hierarchical Organization of Imagery and Text

arXiv:1210.4872v1 | 6 citations
Originality: Incremental advance
AI Analysis

This work addresses the hierarchical organization of multimodal data (imagery and associated text), a problem of interest to researchers in computer vision and natural language processing. It appears incremental because it builds on existing dictionary learning and probabilistic methods.

The authors developed a tree-based dictionary learning model for jointly analyzing imagery and associated text. Each image follows a path through the tree: nodes near the root are shared across paths and capture common image characteristics, while nodes toward the leaves are specialized and capture class-specific details. The tree structure is inferred with a nested Dirichlet process, using a retrospective stick-breaking sampler to infer the tree depth and width.

A tree-based dictionary learning model is developed for joint analysis of imagery and associated text. The dictionary learning may be applied directly to the imagery from patches, or to general feature vectors extracted from patches or superpixels (using any existing method for image feature extraction). Each image is associated with a path through the tree (from root to a leaf), and each of the multiple patches in a given image is associated with one node in that path. Nodes near the tree root are shared between multiple paths, representing image characteristics that are common among different types of images. Moving toward the leaves, nodes become specialized, representing details in image classes. If available, words (text) are also jointly modeled, with a path-dependent probability over words. The tree structure is inferred via a nested Dirichlet process, and a retrospective stick-breaking sampler is used to infer the tree depth and width.
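
The path-through-a-tree idea can be illustrated with a small sketch. The code below is not the authors' inference procedure: it only draws paths from a nested stick-breaking prior, creating child nodes lazily as they are needed, which is the spirit of a retrospective construction. The names `Node`, `pick_child`, `sample_path`, and `assign_patches` are hypothetical, as are the concentration parameter `gamma`, the fixed `max_depth` (the paper infers depth rather than fixing it), and the uniform assignment of patches to levels, which is a placeholder for the model's actual patch-to-node distribution.

```python
import random


class Node:
    """One tree node; child stick weights are created lazily."""

    def __init__(self, depth):
        self.depth = depth
        self.children = []    # child nodes instantiated so far
        self.weights = []     # stick-breaking weight of each child
        self.remaining = 1.0  # stick mass not yet broken off


def pick_child(node, gamma, rng):
    """Draw a child under stick-breaking weights, extending the stick on demand."""
    u = rng.random()
    cumulative = 0.0
    i = 0
    while True:
        if i == len(node.children):
            # Break off a new piece only when the draw reaches past known children.
            v = rng.betavariate(1.0, gamma)
            node.weights.append(v * node.remaining)
            node.remaining *= 1.0 - v
            node.children.append(Node(node.depth + 1))
        cumulative += node.weights[i]
        # The second condition guards against floating-point exhaustion of the stick.
        if u < cumulative or node.remaining < 1e-12:
            return i, node.children[i]
        i += 1


def sample_path(root, max_depth, gamma, rng):
    """Sample a root-to-leaf path as a list of child indices (assumed fixed depth)."""
    path, node = [], root
    for _ in range(max_depth):
        idx, node = pick_child(node, gamma, rng)
        path.append(idx)
    return path


def assign_patches(n_patches, path_length, rng):
    """Placeholder: give each patch a level on the path (0 = root), uniformly."""
    return [rng.randint(0, path_length) for _ in range(n_patches)]


if __name__ == "__main__":
    rng = random.Random(0)
    root = Node(0)
    for image_id in range(5):
        path = sample_path(root, max_depth=3, gamma=1.0, rng=rng)
        levels = assign_patches(n_patches=4, path_length=len(path), rng=rng)
        print(image_id, path, levels)
```

Because every image starts at the same root and popular children carry more stick mass, sampled paths tend to share their first few nodes and diverge deeper in the tree, mirroring the shared-near-root, specialized-near-leaves behavior described above. A smaller `gamma` concentrates mass on fewer children and yields a narrower tree.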

