Jun 20, 2017

Analog CMOS-based Resistive Processing Unit for Deep Neural Network Training

arXiv:1706.06620v1 (49 citations)
Originality: Incremental advance
AI Analysis

This work targets slow DNN training with a hardware solution aimed at AI researchers and practitioners. It appears incremental: it builds on the previously proposed RPU concept and contributes a new CMOS implementation of it.

To accelerate deep neural network training, the authors propose an analog CMOS-based resistive processing unit (CMOS RPU) that stores and processes data locally and operates in a massively parallel manner. The design is offered as an alternative to device candidates based on non-volatile memory, which do not yet satisfy all the requirements of the RPU concept.

Recently we have shown that an architecture based on resistive processing unit (RPU) devices has potential to achieve significant acceleration in deep neural network (DNN) training compared to today's software-based DNN implementations running on CPU/GPU. However, currently available device candidates based on non-volatile memory technologies do not satisfy all the requirements to realize the RPU concept. Here, we propose an analog CMOS-based RPU design (CMOS RPU) which can store and process data locally and can be operated in a massively parallel manner. We analyze various properties of the CMOS RPU to evaluate the functionality and feasibility for acceleration of DNN training.

