ET NE | Jan 6, 2018

Design Exploration of Hybrid CMOS-OxRAM Deep Generative Architectures

arXiv:1801.02003v1

Originality: Synthesis-oriented

AI Analysis

This work addresses hardware efficiency for deep learning applications, though it is incremental as it adapts existing methods to new hardware components.

The paper tackled the challenge of implementing deep generative models using hybrid CMOS-OxRAM architectures, achieving a top-3 test accuracy of 95.5% for a Deep Belief Network and a mean squared error of 0.003 for a Stacked Denoising Autoencoder on a reduced MNIST dataset.

Deep learning and its applications have gained tremendous interest recently in both academia and industry. Restricted Boltzmann Machines (RBMs) offer a key methodology for implementing deep learning paradigms. This paper presents a novel approach for realizing hybrid CMOS-OxRAM based deep generative models (DGMs). In our proposed hybrid DGM architectures, HfOx-based (filamentary-type switching) OxRAM devices are used extensively to realize multiple computational and non-computational functions: (i) synapses (weights), (ii) internal neuron-state storage, (iii) stochastic neuron activation, and (iv) programmable signal normalization. To validate the proposed scheme, we simulated two different architectures, (i) a Deep Belief Network (DBN) and (ii) a Stacked Denoising Autoencoder (SDA), for classification and reconstruction of handwritten digits from a reduced MNIST dataset of 6000 images. Contrastive divergence (CD), specially optimized for OxRAM devices, was used to drive the synaptic weight update mechanism of each layer in the network. The overall learning rule was based on greedy layer-wise learning with no backpropagation, which allows the network to be trained to a good pre-training stage. The performance of the simulated hybrid CMOS-RRAM DGM model closely matches that of a software-based model for a 2-layer deep network. The top-3 test accuracy achieved by the DBN was 95.5%. The MSE of the SDA network was 0.003, lower than that of the software-based approach. Endurance analysis of the simulated architectures shows that for 200 epochs of training (single RBM layer), the maximum number of switching events per OxRAM device was ~7000 cycles.
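As a rough illustration of the training scheme the abstract describes, the following is a minimal software sketch of greedy layer-wise RBM training with one-step contrastive divergence (CD-1) and stochastic binary neuron activation. It is plain NumPy and is not the OxRAM-optimized CD variant used in the paper. The class and function names (`RBM`, `cd1_step`, `train_greedy`), the learning rate, the layer sizes, and the weight initialization are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Hypothetical software RBM; weights stand in for the synaptic devices."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def sample_hidden(self, v):
        # Stochastic neuron activation: fire with probability sigmoid(input).
        p = sigmoid(v @ self.W + self.b_h)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def sample_visible(self, h):
        p = sigmoid(h @ self.W.T + self.b_v)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden statistics driven by the data.
        ph0, h0 = self.sample_hidden(v0)
        # Negative phase: one Gibbs step (reconstruction, then hidden again).
        pv1, _ = self.sample_visible(h0)
        ph1, _ = self.sample_hidden(pv1)
        n = v0.shape[0]
        # CD-1 update: data correlations minus reconstruction correlations.
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b_v += lr * (v0 - pv1).mean(axis=0)
        self.b_h += lr * (ph0 - ph1).mean(axis=0)
        # Mean squared reconstruction error for this batch.
        return float(np.mean((v0 - pv1) ** 2))


def train_greedy(data, layer_sizes, epochs=50, seed=0):
    """Train RBMs one layer at a time, with no backpropagation between layers."""
    rng = np.random.default_rng(seed)
    rbms, x = [], data
    n_in = data.shape[1]
    for n_hidden in layer_sizes:
        rbm = RBM(n_in, n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        # Hidden probabilities of the trained layer feed the next layer.
        x, _ = rbm.sample_hidden(x)
        rbms.append(rbm)
        n_in = n_hidden
    return rbms
```

For example, `train_greedy(binary_images, [64, 32])` would train a 2-layer stack on a hypothetical array of binarized inputs with one row per image. Each layer is trained only on the output of the layer below it, which is what makes the procedure greedy.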
