OC LG ML May 27, 2022

Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming

arXiv:2205.13687v516.214 citations h-index: 67 Has Code

Originality: Incremental advance

AI Analysis

This work addresses computational efficiency for researchers and practitioners in optimization and statistics, though it is incremental as it builds on existing stochastic SQP methods.

The paper reduces the computational cost of online statistical inference for constrained stochastic nonlinear optimization by solving the quadratic program in each iteration of a sequential quadratic programming method inexactly with a sketching solver. It establishes asymptotic normality of the rescaled primal-dual sequence even though the sketching approximation error does not vanish, and illustrates the result on benchmark problems.

We consider online statistical inference of constrained stochastic nonlinear optimization problems. We apply the Stochastic Sequential Quadratic Programming (StoSQP) method to solve these problems, which can be regarded as applying second-order Newton's method to the Karush-Kuhn-Tucker (KKT) conditions. In each iteration, the StoSQP method computes the Newton direction by solving a quadratic program, and then selects a proper adaptive stepsize $\bar{\alpha}_t$ to update the primal-dual iterate. To reduce the dominant computational cost of the method, we inexactly solve the quadratic program in each iteration by employing an iterative sketching solver. Notably, the approximation error of the sketching solver need not vanish as iterations proceed, meaning that the per-iteration computational cost does not blow up. For the above StoSQP method, we show that under mild assumptions, the rescaled primal-dual sequence $1/\sqrt{\bar{\alpha}_t}\cdot (x_t - x^\star, \lambda_t - \lambda^\star)$ converges to a mean-zero Gaussian distribution with a nontrivial covariance matrix depending on the underlying sketching distribution. To perform inference in practice, we also analyze a plug-in covariance matrix estimator. We illustrate the asymptotic normality result of the method both on benchmark nonlinear problems in the CUTEst test set and on linearly/nonlinearly constrained regression problems.
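The inexact Newton step described in the abstract can be illustrated with a minimal sketch. It assumes an equality-constrained problem, so the quadratic program reduces to a linear KKT system, and it uses randomized Kaczmarz as one simple instance of an iterative sketching solver. The abstract does not say which sketching distribution the paper uses, so that choice is an assumption, as are the names `kkt_system`, `sketched_solve`, and `demo` and the random test data.

```python
import numpy as np


def kkt_system(B, G, grad_lagr, c):
    """Assemble the Newton system K z = rhs for the KKT conditions.

    B: (n, n) Hessian approximation of the Lagrangian (assumed given).
    G: (m, n) constraint Jacobian.
    grad_lagr: (n,) gradient of the Lagrangian; c: (m,) constraint values.
    The unknown z stacks the primal and dual directions.
    """
    m = G.shape[0]
    K = np.block([[B, G.T], [G, np.zeros((m, m))]])
    rhs = -np.concatenate([grad_lagr, c])
    return K, rhs


def sketched_solve(K, rhs, num_steps, rng):
    """Run a FIXED number of randomized Kaczmarz steps, starting from zero.

    Each step projects the iterate onto one randomly chosen equation of
    K z = rhs, so the cost per call is fixed and the result is inexact.
    """
    z = np.zeros_like(rhs)
    row_norms = np.sum(K * K, axis=1)
    probs = row_norms / row_norms.sum()  # sample rows by squared norm
    for _ in range(num_steps):
        i = rng.choice(len(rhs), p=probs)
        z += (rhs[i] - K[i] @ z) / row_norms[i] * K[i]
    return z


def demo(seed=0, num_steps=20):
    """Compare the sketched direction with the exact Newton direction."""
    rng = np.random.default_rng(seed)
    n, m = 5, 2
    M = rng.standard_normal((n, n))
    B = M @ M.T + n * np.eye(n)        # positive definite stand-in Hessian
    G = rng.standard_normal((m, n))    # random Jacobian, illustrative only
    grad_lagr = rng.standard_normal(n)
    c = rng.standard_normal(m)

    K, rhs = kkt_system(B, G, grad_lagr, c)
    z_exact = np.linalg.solve(K, rhs)
    z_sketch = sketched_solve(K, rhs, num_steps, rng)

    err_start = np.linalg.norm(z_exact)            # error of the zero start
    err_sketch = np.linalg.norm(z_sketch - z_exact)
    return err_start, err_sketch


if __name__ == "__main__":
    print(demo())
```

Because `num_steps` stays fixed, the per-iteration cost is constant and the error of the sketched direction is generally nonzero, which mirrors the non-vanishing approximation error mentioned in the abstract. In a full StoSQP loop, the returned direction would be scaled by the adaptive stepsize $\bar{\alpha}_t$ and added to the primal-dual iterate; that loop is omitted here.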

