An Investigation of Newton-Sketch and Subsampled Newton Methods
This work addresses the efficiency of optimization when both the number of data points and the number of variables are large, but it is incremental: it compares existing methods without introducing a new paradigm.
The paper investigates two sketching techniques, Hessian subsampling and randomized Hadamard transformations, for Newton's method in large-scale finite-sum optimization. Numerical experiments reveal the relative trade-offs of the two techniques and of the iterative solvers used for the linear systems, and a complexity analysis is given for the Hessian subsampling method.
Sketching, a dimensionality reduction technique, has received much attention in the statistics community. In this paper, we study sketching in the context of Newton's method for solving finite-sum optimization problems in which the number of variables and the number of data points are both large. We study two forms of sketching that perform dimensionality reduction in data space: Hessian subsampling and randomized Hadamard transformations. Each has its own advantages, and their relative tradeoffs have not been investigated in the optimization literature. Our study focuses on practical versions of the two methods in which the resulting linear systems of equations are solved approximately, at every iteration, using an iterative solver. The advantages of using the conjugate gradient method versus a stochastic gradient iteration are revealed through a set of numerical experiments, and a complexity analysis of the Hessian subsampling method is presented.
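
To make the idea of a "practical" subsampled Newton method concrete, the following is a minimal sketch, not the authors' implementation. It assumes an L2-regularized logistic regression objective with labels in {-1, +1} (the abstract does not name a specific objective), forms the Hessian on a random subsample of the data, and solves the Newton system only approximately with a fixed number of conjugate gradient iterations. All function names, the backtracking line search, and the default parameters (lam, sample_size, cg_iters, iters) are illustrative assumptions.

```python
import numpy as np


def logistic_loss_grad(w, X, y, lam):
    """Full-data regularized logistic loss and gradient (assumed objective)."""
    z = y * (X @ w)
    loss = np.mean(np.logaddexp(0.0, -z)) + 0.5 * lam * (w @ w)
    s = 0.5 * (1.0 - np.tanh(0.5 * z))  # sigmoid(-z), computed stably
    grad = -(X.T @ (y * s)) / X.shape[0] + lam * w
    return loss, grad


def subsampled_hessian_vec(w, X_s, y_s, lam):
    """Return v -> H_S v, where H_S is the Hessian on the subsample only."""
    z = y_s * (X_s @ w)
    p = 0.5 * (1.0 + np.tanh(0.5 * z))  # sigmoid(z)
    d = p * (1.0 - p)                   # per-sample curvature weights

    def hv(v):
        # The subsampled Hessian is never formed explicitly.
        return X_s.T @ (d * (X_s @ v)) / X_s.shape[0] + lam * v

    return hv


def conjugate_gradient(hv, b, max_iter, tol=1e-10):
    """Approximately solve H x = b with at most max_iter CG iterations."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            break
        Hp = hv(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def subsampled_newton_cg(X, y, lam=1e-3, sample_size=200,
                         cg_iters=10, iters=20, seed=0):
    """Subsampled Newton with an inexact CG solve (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    w = np.zeros(dim)
    for _ in range(iters):
        loss, g = logistic_loss_grad(w, X, y, lam)
        # Draw a fresh subsample for the Hessian at every iteration.
        idx = rng.choice(n, size=min(sample_size, n), replace=False)
        hv = subsampled_hessian_vec(w, X[idx], y[idx], lam)
        step = conjugate_gradient(hv, -g, cg_iters)
        # Backtracking (Armijo) line search on the full objective.
        t = 1.0
        while t > 1e-8:
            new_loss, _ = logistic_loss_grad(w + t * step, X, y, lam)
            if new_loss <= loss + 1e-4 * t * (g @ step):
                break
            t *= 0.5
        w = w + t * step
    return w
```

In this sketch the gradient uses all data points while only the curvature is subsampled, and each CG iteration costs two products with the subsampled data matrix. Replacing conjugate_gradient with a stochastic gradient iteration on the same linear system would give the alternative inner solver that the abstract compares against.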