LG · NE · May 3, 2015

Highway Networks

arXiv:1505.00387v2 · 1895 citations
Originality: Highly original
AI Analysis

This paper addresses an open problem for machine learning practitioners, the training of very deep networks, and represents a novel architectural advance rather than an incremental improvement.

The paper tackles the problem of training very deep neural networks by introducing highway networks with gating units that regulate information flow, enabling direct training of networks with hundreds of layers using stochastic gradient descent.

There is plenty of theoretical and empirical evidence that depth of neural networks is a crucial ingredient for their success. However, network training becomes more difficult with increasing depth and training of very deep networks remains an open problem. In this extended abstract, we introduce a new architecture designed to ease gradient-based training of very deep networks. We refer to networks with this architecture as highway networks, since they allow unimpeded information flow across several layers on "information highways". The architecture is characterized by the use of gating units which learn to regulate the flow of information through a network. Highway networks with hundreds of layers can be trained directly using stochastic gradient descent and with a variety of activation functions, opening up the possibility of studying extremely deep and efficient architectures.
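
The gating idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formula, so the code below assumes the formulation from the full paper, y = H(x) · T(x) + x · (1 − T(x)), where H is an ordinary nonlinear transform and T is a sigmoid "transform gate". The tanh activation, the helper names (`affine`, `highway_layer`, `init_layer`, `highway_stack`), the weight scale, and the gate-bias values are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    """Logistic function; maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Compute W @ x + b for a list-of-lists matrix W and list vectors b, x."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer: y = H(x) * T(x) + x * (1 - T(x)).

    H is a tanh transform (an assumed choice of activation) and
    T is the sigmoid transform gate. Input and output share one width.
    """
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # gate T(x), each in (0, 1)
    return [t_i * h_i + (1.0 - t_i) * x_i for h_i, t_i, x_i in zip(h, t, x)]


def init_layer(dim, gate_bias, rng):
    """Hypothetical initializer: small Gaussian weights, constant gate bias."""
    W_h = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    W_t = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    return W_h, [0.0] * dim, W_t, [gate_bias] * dim


def highway_stack(x, layers):
    """Apply a sequence of highway layers to the input vector x."""
    for params in layers:
        x = highway_layer(x, *params)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.5, -1.0, 2.0]
    # A strongly negative gate bias keeps the gates nearly closed, so the
    # input is carried through almost unchanged even across 100 layers.
    layers = [init_layer(len(x), -30.0, rng) for _ in range(100)]
    print(highway_stack(x, layers))
```

When the gate is near 0 the layer copies its input, and when it is near 1 the layer behaves like a plain nonlinear layer; a learned gate lets each unit sit anywhere between the two. The bias of -30 is deliberately extreme to make the carry behaviour visible. In practice a mildly negative gate bias would be a more realistic starting point.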
