NE LG

Nov 3, 2017

Accelerating Training of Deep Neural Networks via Sparse Edge Processing

Sourya Dey, Yinan Shao, Keith M. Chugg, Peter A. Beerel

arXiv:1711.01343v1

5 citations

Originality: Incremental advance

AI Analysis

The work aims to make extensive parameter searches and further theoretical study of DNNs practical for researchers and practitioners. It appears incremental because it builds on existing sparsity and hardware-acceleration concepts.

The paper tackles the high memory and computational requirements of deep neural network training by proposing a reconfigurable hardware architecture with structured sparsity. It reports reducing network complexity by up to 30x and training time by up to 35x relative to GPUs, while maintaining inference fidelity.

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational requirements. This novel architecture introduces the notion of edge-processing to provide flexibility and combines junction pipelining and operational parallelization to speed up training. The overall effect is to reduce network complexity by factors up to 30x and training time by up to 35x relative to GPUs, while maintaining high fidelity of inference results. This has the potential to enable extensive parameter searches and development of the largely unexplored theoretical foundation of DNNs. The architecture automatically adapts itself to different network sizes given available hardware resources. As proof of concept, we show results obtained for different bit widths.
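To make the idea of pre-determined, structured sparsity concrete, here is a minimal Python sketch of one junction between two layers in which every right-hand neuron has the same fixed fan-in and every left-hand neuron the same fan-out. The cyclic connection pattern, the function names (`structured_sparse_edges`, `sparse_forward`, `complexity_reduction`), and the example sizes are all assumptions chosen for illustration; the abstract does not specify the pattern the authors use, and this is not their hardware design. The forward pass loops over edges rather than neurons, which only loosely echoes the edge-processing notion.

```python
def structured_sparse_edges(n_left, n_right, fan_in):
    """Build a fixed (left, right) edge list before training starts.

    Hypothetical cyclic pattern: edge e connects left neuron e % n_left
    to right neuron e // fan_in, so every right neuron gets exactly
    `fan_in` distinct inputs.
    """
    if fan_in > n_left:
        raise ValueError("fan_in cannot exceed the number of left neurons")
    n_edges = n_right * fan_in
    if n_edges % n_left != 0:
        raise ValueError("edge count must divide evenly for a uniform fan-out")
    return [(e % n_left, e // fan_in) for e in range(n_edges)]


def sparse_forward(x, weights, edges, n_right):
    """Accumulate one weighted contribution per edge (no activation)."""
    out = [0.0] * n_right
    for w, (left, right) in zip(weights, edges):
        out[right] += w * x[left]
    return out


def complexity_reduction(n_left, n_right, fan_in):
    """Ratio of fully connected edge count to sparse edge count."""
    return (n_left * n_right) / (n_right * fan_in)


# Illustrative sizes only: 8 left neurons, 4 right neurons, fan-in of 2.
edges = structured_sparse_edges(8, 4, 2)
weights = [1.0] * len(edges)
outputs = sparse_forward([1.0] * 8, weights, edges, 4)
reduction = complexity_reduction(8, 4, 2)
```

With these sizes the junction stores 8 weights instead of the 32 a fully connected junction would need, a 4x reduction in this toy case. Because the edge list is fixed up front, memory and per-pass work scale with the number of edges rather than with the product of the layer sizes.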
