Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging
This work addresses convex optimization problems that arise in Poisson inverse problems and quantum state tomography, offering improved time-complexity guarantees for researchers and practitioners in these fields.
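The paper tackles the problem of minimizing an expected logarithmic loss over the probability simplex or the set of quantum density matrices, which covers tasks such as Poisson inverse problems and quantum state tomography. Although the problem is convex, standard iteration complexity guarantees for first-order methods do not directly apply because the loss is neither Lipschitz continuous nor smooth. The paper proposes a stochastic first-order algorithm, B-sample stochastic dual averaging with the logarithmic barrier, that reaches an ε-optimal solution in Õ(d^2/ε^2) time for Poisson inverse problems, matching the state of the art, and in Õ(d^3/ε^2) time for quantum state tomography, improving on existing stochastic first-order methods by a factor of d^(ω-2) and on batch methods by a factor of d^2.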
Consider the problem of minimizing an expected logarithmic loss over either the probability simplex or the set of quantum density matrices. This problem includes tasks such as solving the Poisson inverse problem, computing the maximum-likelihood estimate for quantum state tomography, and approximating positive semi-definite matrix permanents with the currently tightest approximation ratio. Although the optimization problem is convex, standard iteration complexity guarantees for first-order methods do not directly apply due to the absence of Lipschitz continuity and smoothness in the loss function. In this work, we propose a stochastic first-order algorithm named $B$-sample stochastic dual averaging with the logarithmic barrier. For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\smash{\tilde{O}}(d^2/\varepsilon^2)$ time, matching the state of the art, where $d$ denotes the dimension. When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\smash{\tilde{O}}(d^3/\varepsilon^2)$ time. This improves on the time complexities of existing stochastic first-order methods by a factor of $d^{\omega-2}$ and those of batch methods by a factor of $d^2$, where $\omega$ denotes the matrix multiplication exponent. Numerical experiments demonstrate that empirically, our algorithm outperforms existing methods with explicit complexity guarantees.
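To make the simplex setting concrete, the following is a minimal Python sketch of a mini-batch dual averaging scheme regularized by the logarithmic barrier, applied to a loss of the form $f(x) = \mathbb{E}[-\log\langle a, x\rangle]$. It is an illustration under stated assumptions, not the algorithm as specified in the paper: the abstract gives neither the update rule nor the step size, so the function names, the fixed step size `eta`, the batch size, the bisection solver, and the uniform averaging of iterates are all hypothetical choices made here.

```python
import numpy as np


def log_barrier_da_step(grad_sum, eta, tol=1e-12, max_iter=200):
    """Return argmin over the simplex of eta*<grad_sum, x> - sum_i log(x_i).

    Stationarity gives x_i = 1 / (c_i + lam) with c = eta * grad_sum, where
    the multiplier lam is fixed by sum_i x_i = 1. That sum is decreasing in
    lam, so the root is found by bisection (a choice made for this sketch).
    """
    c = eta * np.asarray(grad_sum, dtype=float)
    d = c.size
    m = c.min()
    # At lam = 1 - m the sum is >= 1; at lam = d - m it is <= 1.
    lo, hi = 1.0 - m, d - m
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(1.0 / (c + mid)) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    x = 1.0 / (c + 0.5 * (lo + hi))
    return x / x.sum()  # renormalize to remove residual bisection error


def stochastic_dual_averaging(A, n_iters=200, batch_size=4, eta=0.1, seed=0):
    """Hypothetical mini-batch dual averaging for f(x) = mean_j -log(<a_j, x>).

    A is an (n, d) array with strictly positive entries; each row plays the
    role of one random sample a. eta and batch_size are illustrative values.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.full(d, 1.0 / d)       # start at the barycenter of the simplex
    grad_sum = np.zeros(d)        # running sum of stochastic gradients
    x_avg = np.zeros(d)
    for _ in range(n_iters):
        batch = A[rng.integers(0, n, size=batch_size)]
        # Gradient of -log(<a, x>) is -a / <a, x>, averaged over the batch.
        g = -(batch / (batch @ x)[:, None]).mean(axis=0)
        grad_sum += g
        x = log_barrier_da_step(grad_sum, eta)
        x_avg += x
    return x_avg / n_iters        # average of the iterates produced above


if __name__ == "__main__":
    A = np.random.default_rng(1).uniform(0.1, 1.0, size=(50, 5))
    x_hat = stochastic_dual_averaging(A)
    print(x_hat, x_hat.sum())
```

The barrier keeps every iterate strictly inside the simplex, so $\langle a, x\rangle$ stays positive and the stochastic gradient is always finite even though the loss itself is neither Lipschitz nor smooth near the boundary. The quantum density matrix case would need a matrix analogue of the barrier step, which this sketch does not attempt.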