Algorithms for solving optimization problems arising from deep neural net models: smooth problems
This work tackles optimization bottlenecks for researchers and practitioners applying deep learning, but it appears incremental as it builds on existing Newton methods.
The paper addresses the challenge of solving highly nonlinear optimization problems in deep neural networks by proposing a Newton-based method with directions of negative curvature, showing promising numerical results on security anomaly detection data.
Machine learning models built from multiple layered learning networks have proven effective for a variety of classification problems. The resulting optimization problem, finding the parameter vector that minimizes the empirical risk, is however highly nonlinear. This poses a challenge for the development and application of suitable optimization algorithms. In this paper, we summarize the primary challenges involved and present the case for a Newton-based method incorporating directions of negative curvature, including promising numerical results on data arising from security anomaly detection.
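To make the idea of a Newton-based step combined with a direction of negative curvature concrete, the following is a minimal sketch in Python with NumPy. It is not the paper's algorithm: the two-variable test function, the eigenvalue-based modification of the Hessian, the scaling of the negative curvature direction, the backtracking rule, and all function names (f, grad, hess, direction, minimize) are assumptions chosen for illustration only. The test function has a saddle point at the origin, where the gradient vanishes and a pure Newton step would not move.

```python
import numpy as np

# Hypothetical nonconvex test function (not from the paper):
# saddle point at (0, 0), minimizers at (+1, 0) and (-1, 0).
def f(x):
    return 0.25 * x[0] ** 4 - 0.5 * x[0] ** 2 + 0.5 * x[1] ** 2

def grad(x):
    return np.array([x[0] ** 3 - x[0], x[1]])

def hess(x):
    return np.array([[3.0 * x[0] ** 2 - 1.0, 0.0], [0.0, 1.0]])

def direction(g, H, eps=1e-8):
    """Combine a modified Newton direction with a negative curvature direction."""
    lam, V = np.linalg.eigh(H)  # eigenvalues in ascending order
    # Modified Newton part: use |eigenvalues| so the direction is a descent direction.
    d_newton = -V @ ((V.T @ g) / np.maximum(np.abs(lam), eps))
    # Negative curvature part: eigenvector of the most negative eigenvalue,
    # signed so that it does not point uphill (scaling is an assumption).
    d_neg = np.zeros_like(g)
    if lam[0] < -eps:
        v = V[:, 0]
        if v @ g > 0:
            v = -v
        d_neg = abs(lam[0]) * v
    return d_newton + d_neg, lam[0]

def minimize(x0, tol=1e-8, max_iter=100, c=1e-4):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        d, lam_min = direction(g, H)
        # Stop only at points satisfying second-order conditions approximately.
        if np.linalg.norm(g) < tol and lam_min > -tol:
            break
        # Backtracking on a sufficient decrease test that includes the curvature term.
        t = 1.0
        for _ in range(60):
            model = t * (g @ d) + 0.5 * t ** 2 * min(d @ H @ d, 0.0)
            if f(x + t * d) <= f(x) + c * model:
                break
            t *= 0.5
        x = x + t * d
    return x

# Start exactly at the saddle point, where the gradient is zero.
x_star = minimize([0.0, 0.0])
print(x_star, f(x_star))
```

In this sketch the negative curvature direction is what lets the iteration leave the saddle point. A method that relies on the gradient alone, or on an unmodified Newton step, has no reason to move from a point where the gradient is zero.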