Implementation of a modified Nesterov's Accelerated quasi-Newton Method on Tensorflow
This work addresses faster convergence for non-convex optimization in machine learning, but it is incremental as it modifies an existing method.
The paper tackles non-convex optimization by implementing a modified Nesterov's Accelerated Quasi-Newton method on TensorFlow, and reports that it converges better and faster than first-order optimizers such as Adam and second-order methods such as the conventional quasi-Newton method on benchmark problems.
Recent studies incorporate Nesterov's accelerated gradient method to accelerate gradient-based training. Nesterov's Accelerated Quasi-Newton (NAQ) method has been shown to drastically improve convergence speed compared to the conventional quasi-Newton method. This paper implements NAQ for non-convex optimization on TensorFlow. Two modifications to the original NAQ algorithm are proposed to ensure global convergence and eliminate line search. The performance of the proposed algorithm, mNAQ, is evaluated on standard non-convex function approximation benchmark problems and microwave circuit modelling problems. The results show that the improved algorithm converges better and faster than first-order optimizers such as AdaGrad, RMSProp, and Adam, and second-order methods such as the quasi-Newton method.
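The abstract does not spell out the NAQ iteration or the two modifications, so the following is only a rough illustration of the general idea as it is commonly described: the gradient is evaluated at the look-ahead point w + mu*v, and a BFGS-style update of the inverse Hessian approximation uses differences taken relative to that point. The function name naq_minimize, the fixed step size alpha, the momentum value mu, and the curvature guard are assumptions made for this sketch. It is written in NumPy rather than TensorFlow, and it is not the paper's mNAQ.

```python
import numpy as np


def naq_minimize(grad, w0, mu=0.8, alpha=1.0, max_iters=200, tol=1e-10):
    """Illustrative NAQ-style iteration (hypothetical helper, not the paper's code).

    grad  : callable returning the gradient at a point
    w0    : initial parameter vector
    mu    : momentum coefficient (assumed constant here)
    alpha : fixed step size (an assumption; no line search is performed)
    """
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)          # momentum (velocity) term
    identity = np.eye(w.size)
    H = identity.copy()           # inverse Hessian approximation

    for _ in range(max_iters):
        # Nesterov look-ahead point: the gradient is taken here, not at w.
        look = w + mu * v
        g_look = grad(look)

        # Quasi-Newton direction combined with momentum.
        v = mu * v - alpha * (H @ g_look)
        w = w + v

        g_new = grad(w)
        if np.linalg.norm(g_new) < tol:
            break

        # Secant pair measured from the look-ahead point.
        s = w - look
        y = g_new - g_look
        sy = float(s @ y)

        # Skip the update when the curvature condition is not safely positive.
        if sy > 1e-10 * np.linalg.norm(s) * np.linalg.norm(y):
            rho = 1.0 / sy
            V = identity - rho * np.outer(s, y)
            # Standard BFGS update of the inverse Hessian approximation.
            H = V @ H @ V.T + rho * np.outer(s, s)

    return w
```

With a fixed step size and no safeguards beyond the curvature check, this sketch can only be expected to behave well on well-conditioned problems, for example a convex quadratic whose Hessian is close to the identity. The line search removal and the global convergence guarantee that the abstract attributes to mNAQ are not reproduced here.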