LG, OC, ML · May 16, 2018

End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition

arXiv:1805.06523v1 · 15 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficiently training deep convolutional networks and is aimed primarily at machine learning researchers. It appears incremental because it builds on existing tensor decomposition methods.

The paper proposes Deep Tensor Decomposition (DeepTD), a data-efficient algorithm based on rank-1 tensor decomposition, for learning the weights of a convolutional neural network. Under a realizable model with Gaussian inputs and planted kernels, DeepTD provably works as soon as the sample size exceeds the total number of convolutional weights.

In this paper we study the problem of learning the weights of a deep convolutional neural network. We consider a network where convolutions are carried out over non-overlapping patches with a single kernel in each layer. We develop an algorithm for simultaneously learning all the kernels from the training data. Our approach, dubbed Deep Tensor Decomposition (DeepTD), is based on a rank-1 tensor decomposition. We theoretically investigate DeepTD under a realizable model for the training data, where the inputs are chosen i.i.d. from a Gaussian distribution and the labels are generated according to planted convolutional kernels. We show that DeepTD is data-efficient and provably works as soon as the sample size exceeds the total number of convolutional weights in the network. We carry out a variety of numerical experiments to investigate the effectiveness of DeepTD and verify our theoretical findings.
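To make the setting in the abstract concrete, here is a minimal NumPy sketch of the planted model and a rank-1 recovery step. It is an illustration under stated assumptions, not the paper's implementation. The assumptions are: a network of only two layers, a ReLU activation, a label-weighted average of the inputs as the tensor to decompose, and an SVD standing in for the rank-1 decomposition (for an order-2 tensor the best rank-1 approximation is the top singular pair). The function names `planted_two_layer_cnn` and `rank1_kernel_estimates` are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def planted_two_layer_cnn(X, k1, k2):
    """Labels from planted kernels (assumed 2-layer model, ReLU activation).

    X has shape (n, d2 * d1); each row is split into d2 non-overlapping
    patches of length d1, and the single first-layer kernel k1 is applied
    to every patch.
    """
    n = X.shape[0]
    patches = X.reshape(n, k2.size, k1.size)  # (n, d2, d1)
    hidden = relu(patches @ k1)               # (n, d2)
    return hidden @ k2                        # (n,)

def rank1_kernel_estimates(X, y, d1, d2):
    """Estimate both kernel directions from a rank-1 fit (assumed estimator)."""
    n = X.shape[0]
    patches = X.reshape(n, d2, d1)
    # Label-weighted average of the inputs, arranged as a d2 x d1 array.
    # For Gaussian inputs this is approximately proportional to outer(k2, k1).
    T = np.einsum("i,ijl->jl", y - y.mean(), patches) / n
    U, s, Vt = np.linalg.svd(T)
    return Vt[0], U[:, 0]  # unit-norm estimates of k1 and k2, up to sign

rng = np.random.default_rng(0)
d1, d2, n = 5, 4, 20000
k1 = rng.standard_normal(d1)
k1 /= np.linalg.norm(k1)
k2 = rng.standard_normal(d2)
k2 /= np.linalg.norm(k2)

X = rng.standard_normal((n, d1 * d2))  # i.i.d. Gaussian inputs
y = planted_two_layer_cnn(X, k1, k2)
k1_hat, k2_hat = rank1_kernel_estimates(X, y, d1, d2)

# Absolute cosine similarity, since a rank-1 factorization fixes the
# factors only up to sign and scale.
align1 = abs(k1_hat @ k1)
align2 = abs(k2_hat @ k2)
print(align1, align2)
```

Alignment is measured by absolute cosine similarity because the factors of a rank-1 decomposition are identifiable only up to sign and scale. With more than two layers the array `T` becomes a higher-order tensor and the SVD would have to be replaced by a genuine tensor rank-1 approximation.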

