LG ML Nov 22, 2018

Enhanced Expressive Power and Fast Training of Neural Networks by Random Projections

Jian-Feng Cai, Dong Li, Jiaze Sun, Ke Wang

arXiv:1811.09054v24.77 citations

Originality: Incremental advance

AI Analysis

This work addresses efficiency challenges in deep learning for researchers and practitioners dealing with high-dimensional data, offering a method to reduce computational costs, though it is incremental as it builds on existing random projection techniques.

The paper tackles the high parameter count and slow training of neural networks by using random projections to reduce dimensionality for sparse or manifold-structured data. It proves that the number of neurons required depends mainly on the sparsity or the manifold dimension and only weakly on the input dimension, and it proposes models that accelerate training without too much performance loss.

Random projections can perform dimension reduction efficiently for datasets with nonlinear low-dimensional structures. One well-known example is that random matrices embed sparse vectors into a low-dimensional subspace nearly isometrically, known as the restricted isometry property in compressed sensing. In this paper, we explore some applications of random projections in deep neural networks. We characterize the expressive power of fully connected neural networks when the input data are sparse vectors or form a low-dimensional smooth manifold. We prove that the number of neurons required for approximating a Lipschitz function with a prescribed precision depends on the sparsity or the dimension of the manifold and only weakly on the dimension of the input vector. The key to our proof is that random projections stably embed the set of sparse vectors or a low-dimensional smooth manifold into a low-dimensional subspace. Based on this fact, we also propose some new neural network models, where at each layer the input is first projected onto a low-dimensional subspace by a random projection and then the standard linear connection and non-linear activation are applied. In this way, the number of parameters in neural networks is significantly reduced, and therefore the training of neural networks can be accelerated without too much performance loss.
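The layer structure described in the abstract (random projection first, then the usual linear connection and activation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian projection matrix, the ReLU activation, the dimensions, and the names `random_projection` and `projected_layer` are all assumptions chosen for illustration.

```python
import numpy as np

def random_projection(in_dim, proj_dim, rng):
    # Assumption: a Gaussian matrix, scaled so that E||P x||^2 = ||x||^2.
    return rng.standard_normal((proj_dim, in_dim)) / np.sqrt(proj_dim)

def projected_layer(x, P, W, b):
    # The projection P is fixed (not trained); only W and b would be learned.
    z = P @ x
    # Standard linear connection followed by a ReLU activation (assumed).
    return np.maximum(W @ z + b, 0.0)

rng = np.random.default_rng(0)
in_dim, proj_dim, out_dim, sparsity = 10_000, 400, 64, 10

# Build a sparse input vector with `sparsity` nonzero entries.
x = np.zeros(in_dim)
support = rng.choice(in_dim, size=sparsity, replace=False)
x[support] = rng.standard_normal(sparsity)

# Near-isometry check: the projected norm should stay close to the original.
P = random_projection(in_dim, proj_dim, rng)
ratio = np.linalg.norm(P @ x) / np.linalg.norm(x)

# Trainable weights act on the projected input, not on the raw input.
W = rng.standard_normal((out_dim, proj_dim)) * np.sqrt(2.0 / proj_dim)
b = np.zeros(out_dim)
h = projected_layer(x, P, W, b)

# Trainable parameter counts with and without the projection.
dense_params = out_dim * in_dim + out_dim
projected_params = out_dim * proj_dim + out_dim

print(f"norm ratio after projection: {ratio:.3f}")
print(f"trainable parameters: dense={dense_params}, projected={projected_params}")
```

With these illustrative sizes, the trainable weight matrix shrinks from `out_dim x in_dim` to `out_dim x proj_dim`, which is the source of the parameter reduction the abstract describes. How small `proj_dim` can be for a given sparsity or manifold dimension is what the paper's theory addresses; the value used here is arbitrary.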
