LGDSOCMLJul 24, 2018

On the Randomized Complexity of Minimizing a Convex Quadratic Function

arXiv:1807.09386v717 citations
Originality Incremental advance
AI Analysis

This provides fundamental lower bounds for a core optimization problem in machine learning, establishing minimax rates for convex quadratic minimization.

The paper tackles the problem of proving gradient-query complexity lower bounds for minimizing convex quadratic functions, showing that any randomized algorithm requires Ω(√κ) queries to achieve a solution within a constant relative error, where κ is the condition number.

Minimizing a convex, quadratic objective of the form $f_{\mathbf{A},\mathbf{b}}(x) := \frac{1}{2}x^\top \mathbf{A} x - \langle \mathbf{b}, x \rangle$ for $\mathbf{A} \succ 0 $ is a fundamental problem in machine learning and optimization. In this work, we prove gradient-query complexity lower bounds for minimizing convex quadratic functions which apply to both deterministic and \emph{randomized} algorithms. Specifically, for $κ> 1$, we exhibit a distribution over $(\mathbf{A},\mathbf{b})$ with condition number $\mathrm{cond}(\mathbf{A}) \le κ$, such that any \emph{randomized} algorithm requires $Ω(\sqrtκ)$ gradient queries to find a solution $\hat x$ for which $\|\hat x - \mathbf x_\star\| \le ε_0\|\mathbf{x}_{\star}\|$, where $\mathbf x_{\star} = \mathbf{A}^{-1}\mathbf{b}$ is the optimal solution, and $ε_0$ a small constant. Setting $κ=1/ε$, this lower bound implies the minimax rate of $T = Ω(λ_1(\mathbf{A})\|\mathbf x_\star\|^2/\sqrtε)$ queries required to minimize an arbitrary convex quadratic function up to error $f(\hat{x}) - f(\mathbf x_\star) \le ε$. Our lower bound holds for a distribution derived from classical ensembles in random matrix theory, and relies on a careful reduction from adaptively estimating a planted vector $\mathbf u$ in a deformed Wigner model. A key step in deriving sharp lower bounds is demonstrating that the optimization error $\mathbf x_\star - \hat x$ cannot align too closely with $\mathbf{u}$. To this end, we prove an upper bound on the cosine between $\mathbf x_\star - \hat x$ and $\mathbf u$ in terms of the MMSE of estimating the plant $\mathbf u$ in a deformed Wigner model. We then bound the MMSE by carefully modifying a result due to Lelarge and Miolane 2016, which rigorously establishes a general replica-symmetric formula for planted matrix models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes