LGMLOct 16, 2012

Exploiting compositionality to explore a large space of model structures

arXiv:1210.4856v1112 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of model selection in unsupervised learning for researchers and practitioners, offering a generalizable approach across diverse datasets, though it is incremental in automating existing model exploration.

The authors tackled the problem of automatically selecting appropriate probabilistic model structures for datasets by organizing matrix decomposition models into a context-free grammar, enabling efficient evaluation of nearly 2500 structures with a greedy search that typically finds correct structures for synthetic data and adapts to noise.

The recent proliferation of richly structured probabilistic models raises the question of how to automatically determine an appropriate model for a dataset. We investigate this question for a space of matrix decomposition models which can express a variety of widely used models from unsupervised learning. To enable model selection, we organize these models into a context-free grammar which generates a wide variety of structures through the compositional application of a few simple rules. We use our grammar to generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 structures using a small toolbox of reusable algorithms. Using a greedy search over our grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes