Concentration inequalities for semidefinite least squares based on data
This work addresses the need for reliable, data-driven optimization in machine learning and statistics, offering a practical certificate for semidefinite programming, but it is incremental as it builds on existing least squares and concentration inequality frameworks.
The paper tackles the problem of providing finite-sample guarantees for semidefinite least squares problems with relaxed constraints, deriving a high-confidence bound that ensures eigenvalues of solutions are ε-close to those enforced by constraints, with the bound shrinking as data increases. It also establishes error bounds for gradient descent iterates when learning unknown quadratic functions.
We study data-driven least squares (LS) problems with semidefinite (SD) constraints and derive finite-sample guarantees on the spectrum of their optimal solutions when these constraints are relaxed. In particular, we provide a high confidence bound allowing one to solve a simpler program in place of the full SDLS problem, while ensuring that the eigenvalues of the resulting solution are $\varepsilon$-close of those enforced by the SD constraints. The developed certificate, which consistently shrinks as the number of data increases, turns out to be easy-to-compute, distribution-free, and only requires independent and identically distributed samples. Moreover, when the SDLS is used to learn an unknown quadratic function, we establish bounds on the error between a gradient descent iterate minimizing the surrogate cost obtained with no SD constraints and the true minimizer.