LG NE ML · Jun 28, 2015

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

Majid Janzamin, Hanie Sedghi, Anima Anandkumar

arXiv:1506.08473v3 · 32.8 · 144 citations

Originality: Highly original

AI Analysis

This paper provides a guaranteed training method for two-layer neural networks, addressing a foundational challenge in machine learning, though its contribution is incremental in scope because it covers only specific network architectures and conditions.

The paper tackles the non-convex optimization problem in training neural networks by proposing a tensor decomposition algorithm that provably converges to the global optimum under mild conditions, with polynomial sample complexity and competitive computational efficiency compared to SGD.

Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks. We provide risk bounds for our proposed method, with a polynomial sample complexity in the relevant parameters, such as input dimension and number of neurons. While learning arbitrary target functions is NP-hard, we provide transparent conditions on the function and the input for learnability. Our training method is based on tensor decomposition, which provably converges to the global optimum, under a set of mild non-degeneracy conditions. It consists of simple embarrassingly parallel linear and multi-linear operations, and is competitive with standard stochastic gradient descent (SGD), in terms of computational complexity. Thus, we propose a computationally efficient method with guaranteed risk bounds for training neural networks with one hidden layer.
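The abstract names tensor decomposition as the core routine but does not spell out the procedure. As a rough illustration only, the following sketch shows one common way to decompose a symmetric third-order tensor: power iteration with deflation. It assumes the tensor has the form T = sum_r w_r v_r ⊗ v_r ⊗ v_r with orthonormal components v_r and positive weights w_r, which is a simplification and not necessarily the setting of the paper. The function names `tensor_power_iteration` and `decompose` are hypothetical, and the step that estimates such a tensor from training data is not shown.

```python
import numpy as np


def tensor_power_iteration(T, n_iter=100, seed=0):
    """Find one (weight, component) pair of a symmetric 3rd-order tensor.

    Assumption: T is (close to) orthogonally decomposable with positive weights.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        # Multilinear map T(I, v, v): contract the last two modes with v.
        v = np.einsum("ijk,j,k->i", T, v, v)
        v /= np.linalg.norm(v)
    # The weight is the fully contracted value T(v, v, v).
    weight = np.einsum("ijk,i,j,k->", T, v, v, v)
    return weight, v


def decompose(T, rank, n_iter=100):
    """Recover `rank` components by repeated power iteration and deflation."""
    residual = np.array(T, dtype=float)  # work on a copy
    weights, components = [], []
    for r in range(rank):
        weight, v = tensor_power_iteration(residual, n_iter=n_iter, seed=r)
        weights.append(weight)
        components.append(v)
        # Deflation: subtract the rank-1 term that was just found.
        residual -= weight * np.einsum("i,j,k->ijk", v, v, v)
    return np.array(weights), np.array(components)
```

Each iteration is a pair of tensor contractions followed by a normalization, which matches the abstract's description of simple linear and multilinear operations. In practice, several random restarts per component are typically used to make the procedure robust to noise; the sketch omits them for brevity.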
