Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method
This work addresses a theoretical inverse problem with implications for data privacy and adversarial security in machine learning, representing an incremental advancement in leveraging score methods.
The paper tackles the problem of recovering model parameters from leverage score gradients, introducing an iterative algorithm that solves a regularized least squares problem with an approximate Hessian using subsampled leverage scores, achieving a per-iteration cost of O((nnz(A) + d^ω) * poly(log(n/δ))) and requiring T = log(||x_0 - x*||_2/ε) iterations.
Leverage scores have become essential in statistics and machine learning, aiding regression analysis, randomized matrix computations, and various other tasks. This paper delves into the inverse problem, aiming to recover the intrinsic model parameters given the leverage scores gradient. This endeavor not only enriches the theoretical understanding of models trained with leverage score techniques but also has substantial implications for data privacy and adversarial security. We specifically scrutinize the inversion of the leverage score gradient, denoted as $g(x)$. An innovative iterative algorithm is introduced for the approximate resolution of the regularized least squares problem stated as $\min_{x \in \mathbb{R}^d} 0.5 \|g(x) - c\|_2^2 + 0.5\|\mathrm{diag}(w)Ax\|_2^2$. Our algorithm employs subsampled leverage score distributions to compute an approximate Hessian in each iteration, under standard assumptions, considerably mitigating the time complexity. Given that a total of $T = \log(\| x_0 - x^* \|_2/ ε)$ iterations are required, the cost per iteration is optimized to the order of $O( (\mathrm{nnz}(A) + d^ω ) \cdot \mathrm{poly}(\log(n/δ))$, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$.