DS LG OC CO MLOct 7, 2020

Structured Logconcave Sampling with a Restricted Gaussian Oracle

arXiv:2010.03106v420.8107 citationsh-index: 33

Originality Highly original

AI Analysis

This work addresses the challenge of efficient sampling in machine learning and statistics, particularly for structured distributions, offering incremental improvements in mixing times and novel query complexities for high-accuracy samplers.

The paper tackles the problem of sampling from structured logconcave distributions, such as composite densities and logconcave finite sums, by developing algorithms that achieve high accuracy with improved mixing times and query complexities. For composite densities, it matches the state-of-the-art non-composite bound with a mixing time of O(κd log³(κd/ε)), and for logconcave finite sums, it provides a sampler with Õ(n + κ max(d, √(nd))) gradient queries, where no such efficient samplers were previously known.

We give algorithms for sampling several structured logconcave families to high accuracy. We further develop a reduction framework, inspired by proximal point methods in convex optimization, which bootstraps samplers for regularized densities to improve dependences on problem conditioning. A key ingredient in our framework is the notion of a "restricted Gaussian oracle" (RGO) for $g: \mathbb{R}^d \rightarrow \mathbb{R}$, which is a sampler for distributions whose negative log-likelihood sums a quadratic and $g$. By combining our reduction framework with our new samplers, we obtain the following bounds for sampling structured distributions to total variation distance $ε$. For composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number $κ$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a mixing time of $O(κd \log^3\frac{κd}ε)$, matching the state-of-the-art non-composite bound; no composite samplers with better mixing than general-purpose logconcave samplers were previously known. For logconcave finite sums $\exp(-F(x))$, where $F(x) = \frac{1}{n}\sum_{i \in [n]} f_i(x)$ has condition number $κ$, we give a sampler querying $\widetilde{O}(n + κ\max(d, \sqrt{nd}))$ gradient oracles to $\{f_i\}_{i \in [n]}$; no high-accuracy samplers with nontrivial gradient query complexity were previously known. For densities with condition number $κ$, we give an algorithm obtaining mixing time $O(κd \log^2\frac{κd}ε)$, improving the prior state-of-the-art by a logarithmic factor with a significantly simpler analysis; we also show a zeroth-order algorithm attains the same query complexity.

View on arXiv PDF

Similar