LG · ML · Jul 30, 2018

Faster Convergence & Generalization in DNNs

arXiv:1807.11414v3 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses inefficient training and data-hungry generalization in deep learning. The method could benefit practitioners across domains, though it appears incremental because it builds on existing optimization techniques.

The paper tackles the slow training and poor generalization of deep neural networks by developing an optimization algorithm based on generalized-optimal updates derived from minibatches. On two benchmark datasets, the authors report a speedup of two orders of magnitude over traditional back-propagation, along with improved robustness to noise and over-fitting.

Deep neural networks have gained tremendous popularity in the last few years. They have been applied to classification tasks in almost every domain. Despite this success, deep networks can be incredibly slow to train, even for moderately sized models on sufficiently large datasets. Additionally, these networks require large amounts of data to generalize. The importance of speeding up convergence and improving generalization in deep networks cannot be overstated. In this work, we develop an optimization algorithm based on generalized-optimal updates derived from minibatches that lead to faster convergence. Finally, we demonstrate on two benchmark datasets that the proposed method achieves a speedup of two orders of magnitude over traditional back-propagation and is more robust to noise and over-fitting.

