ISAAC Newton: Input-based Approximate Curvature for Newton's Method
The work addresses the computational bottleneck of second-order optimization for machine learning practitioners, though the contribution appears incremental because it builds on existing Newton's method approaches.
The paper tackles the computational overhead of second-order optimization methods by proposing ISAAC, an input-based approximate curvature method that conditions gradients with selected second-order information. The method trains effectively in small-batch regimes, and its computational overhead vanishes asymptotically when the batch size is smaller than the number of neurons, making it competitive with both first-order and second-order methods.
We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based only on the input to the respective layer, without substantial computational overhead. The proposed method allows effective training even in small-batch stochastic regimes, which makes it competitive with first-order as well as second-order methods.
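To illustrate why a batch size smaller than the layer width can keep an input-based conditioner cheap, here is a minimal sketch. The abstract does not state the conditioner's formula, so the form used here is an assumption: the weight gradient is multiplied by the inverse of a Tikhonov-regularized input second-moment matrix, (x^T x / b + lam * I)^{-1}, where x is the b x n layer input. The function names, the regularizer `lam`, and the shapes are hypothetical and chosen only for illustration. The Woodbury identity lets the same product be computed by solving a b x b system instead of an n x n one.

```python
import random

def matmul(A, B):
    # Plain dense matrix product on lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, B):
    # Solve A X = B by Gauss-Jordan elimination with partial pivoting.
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def condition_direct(x, grad, lam):
    # Assumed conditioner: (x^T x / b + lam * I_n)^{-1} grad, via an n x n solve.
    b, n = len(x), len(x[0])
    G = matmul(transpose(x), x)
    A = [[G[i][j] / b + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(A, grad)

def condition_woodbury(x, grad, lam):
    # Same product via the Woodbury identity, solving only a b x b system:
    # (1/lam) * (grad - (1/b) * x^T (x x^T / b + lam * I_b)^{-1} x grad)
    b = len(x)
    K = matmul(x, transpose(x))
    S = [[K[i][j] / b + (lam if i == j else 0.0) for j in range(b)] for i in range(b)]
    t = solve(S, matmul(x, grad))
    corr = matmul(transpose(x), t)
    return [[(g - c / b) / lam for g, c in zip(grow, crow)]
            for grow, crow in zip(grad, corr)]

# Hypothetical sizes: batch b = 3, layer inputs n = 8, layer outputs m = 2.
rng = random.Random(0)
b, n, m, lam = 3, 8, 2, 0.1
x = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(b)]
grad = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]

direct = condition_direct(x, grad, lam)
cheap = condition_woodbury(x, grad, lam)
max_diff = max(abs(d - c) for dr, cr in zip(direct, cheap) for d, c in zip(dr, cr))
print("largest entrywise difference:", max_diff)
```

Both routes return the same n x m matrix up to floating-point error. When b is much smaller than n, the second route replaces the n x n solve with a b x b one, which is one plausible reading of the abstract's condition on batch size.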