Joe Klobusicky

May 21, 2017

Convergence of backpropagation with momentum for network architectures with skip connections

Chirag Agarwal, Joe Klobusicky, Dan Schonfeld

We study a class of deep neural networks whose architectures form a directed acyclic graph (DAG). For backpropagation defined by gradient descent with adaptive momentum, we show that the weights converge for a large class of nonlinear activation functions. The proof generalizes the results of Wu et al. (2008), who showed convergence for a feed-forward network with one hidden layer. To illustrate the effectiveness of DAG architectures, we describe an application to compression through an autoencoder and compare it against sequential feed-forward networks under several metrics.

1 Paper