ML LGDec 4, 2013

Multiscale Dictionary Learning for Estimating Conditional Distributions

Francesca Petralia, Joshua Vogelstein, David B. Dunson

arXiv:1312.1099v14 citations

Originality Incremental advance

AI Analysis

This addresses the challenge of flexibly modeling conditional distributions in high-dimensional data, particularly for neuroscience, but is incremental as it builds on existing dictionary learning and tree decomposition methods.

The paper tackles the problem of nonparametric estimation of conditional distributions for high-dimensional features, proposing a multiscale dictionary learning model that scales to approximately one million features and demonstrates state-of-the-art predictive performance in neuroscience applications.

Nonparametric estimation of the conditional distribution of a response given high-dimensional features is a challenging problem. It is important to allow not only the mean but also the variance and shape of the response density to change flexibly with features, which are massive-dimensional. We propose a multiscale dictionary learning model, which expresses the conditional response density as a convex combination of dictionary densities, with the densities used and their weights dependent on the path through a tree decomposition of the feature space. A fast graph partitioning algorithm is applied to obtain the tree decomposition, with Bayesian methods then used to adaptively prune and average over different sub-trees in a soft probabilistic manner. The algorithm scales efficiently to approximately one million features. State of the art predictive performance is demonstrated for toy examples and two neuroscience applications including up to a million features.

View on arXiv PDF

Similar