MLCLLGJun 12, 2018

Sparse Stochastic Zeroth-Order Optimization with an Application to Bandit Structured Prediction

arXiv:1806.04458v37 citations
AI Analysis

This work addresses efficiency issues in gradient-free optimization for sparse problems like structured prediction, offering incremental improvements over existing methods.

The paper tackles the high iteration complexity of stochastic zeroth-order optimization in high-dimensional settings by leveraging sparsity patterns, reducing the complexity factor to the expected number of active features. Experimental results on bandit structured prediction tasks with sparse features confirm the theoretical improvements.

Stochastic zeroth-order (SZO), or gradient-free, optimization allows to optimize arbitrary functions by relying only on function evaluations under parameter perturbations, however, the iteration complexity of SZO methods suffers a factor proportional to the dimensionality of the perturbed function. We show that in scenarios with natural sparsity patterns as in structured prediction applications, this factor can be reduced to the expected number of active features over input-output pairs. We give a general proof that applies sparse SZO optimization to Lipschitz-continuous, nonconvex, stochastic objectives, and present an experimental evaluation on linear bandit structured prediction tasks with sparse word-based feature representations that confirm our theoretical results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes