NEApr 30, 2024
DelGrad: Exact event-based gradients for training delays and weights on spiking neuromorphic hardwareJulian Göltz, Jimmy Weber, Laura Kriener et al.
Spiking neural networks (SNNs) inherently rely on the timing of signals for representing and processing information. Incorporating trainable transmission delays, alongside synaptic weights, is crucial for shaping these temporal dynamics. While recent methods have shown the benefits of training delays and weights in terms of accuracy and memory efficiency, they rely on discrete time, approximate gradients, and full access to internal variables like membrane potentials. This limits their precision, efficiency, and suitability for neuromorphic hardware due to increased memory requirements and I/O bandwidth demands. To address these challenges, we propose DelGrad, an analytical, event-based method to compute exact loss gradients for both synaptic weights and delays. The inclusion of delays in the training process emerges naturally within our proposed formalism, enriching the model's search space with a temporal dimension. Moreover, DelGrad, grounded purely in spike timing, eliminates the need to track additional variables such as membrane potentials. To showcase this key advantage, we demonstrate the functionality and benefits of DelGrad on the BrainScaleS-2 neuromorphic platform, by training SNNs in a chip-in-the-loop fashion. For the first time, we experimentally demonstrate the memory efficiency and accuracy benefits of adding delays to SNNs on noisy mixed-signal hardware. Additionally, these experiments also reveal the potential of delays for stabilizing networks against noise. DelGrad opens a new way for training SNNs with delays on neuromorphic hardware, which results in fewer required parameters, higher accuracy and ease of hardware training.
ARMay 13, 2025
MINIMALIST: switched-capacitor circuits for efficient in-memory computation of gated recurrent unitsSebastian Billaudelle, Laura Kriener, Filippo Moro et al.
Recurrent neural networks (RNNs) have been a long-standing candidate for processing of temporal sequence data, especially in memory-constrained systems that one may find in embedded edge computing environments. Recent advances in training paradigms have now inspired new generations of efficient RNNs. We introduce a streamlined and hardware-compatible architecture based on minimal gated recurrent units (GRUs), and an accompanying efficient mixed-signal hardware implementation of the model. The proposed design leverages switched-capacitor circuits not only for in-memory computation (IMC), but also for the gated state updates. The mixed-signal cores rely solely on commodity circuits consisting of metal capacitors, transmission gates, and a clocked comparator, thus greatly facilitating scaling and transfer to other technology nodes. We benchmark the performance of our architecture on time series data, introducing all constraints required for a direct mapping to the hardware system. The direct compatibility is verified in mixed-signal simulations, reproducing data recorded from the software-only network model.
NEJan 26, 2022
The BrainScaleS-2 accelerated neuromorphic system with hybrid plasticityChristian Pehle, Sebastian Billaudelle, Benjamin Cramer et al.
Since the beginning of information processing by electronic components, the nervous system has served as a metaphor for the organization of computational primitives. Brain-inspired computing today encompasses a class of approaches ranging from using novel nano-devices for computation to research into large-scale neuromorphic architectures, such as TrueNorth, SpiNNaker, BrainScaleS, Tianjic, and Loihi. While implementation details differ, spiking neural networks - sometimes referred to as the third generation of neural networks - are the common abstraction used to model computation with such systems. Here we describe the second generation of the BrainScaleS neuromorphic architecture, emphasizing applications enabled by this architecture. It combines a custom analog accelerator core supporting the accelerated physical emulation of bio-inspired spiking neural network primitives with a tightly coupled digital processor and a digital event-routing network.
ARMar 29, 2021
Demonstrating Analog Inference on the BrainScaleS-2 Mobile SystemYannik Stradmann, Sebastian Billaudelle, Oliver Breitwieser et al.
We present the BrainScaleS-2 mobile system as a compact analog inference engine based on the BrainScaleS-2 ASIC and demonstrate its capabilities at classifying a medical electrocardiogram dataset. The analog network core of the ASIC is utilized to perform the multiply-accumulate operations of a convolutional deep neural network. At a system power consumption of 5.6W, we measure a total energy consumption of 192uJ for the ASIC and achieve a classification time of 276us per electrocardiographic patient sample. Patients with atrial fibrillation are correctly identified with a detection rate of (93.7${\pm}$0.7)% at (14.0${\pm}$1.0)% false positives. The system is directly applicable to edge inference applications due to its small size, power envelope, and flexible I/O capabilities. It has enabled the BrainScaleS-2 ASIC to be operated reliably outside a specialized lab setting. In future applications, the system allows for a combination of conventional machine learning layers with online learning in spiking neural networks on a single neuromorphic platform.
ETAug 3, 2020
Spiking neuromorphic chip learns entangled quantum statesStefanie Czischek, Andreas Baumbach, Sebastian Billaudelle et al.
The approximation of quantum states with artificial neural networks has gained a lot of attention during the last years. Meanwhile, analog neuromorphic chips, inspired by structural and dynamical properties of the biological brain, show a high energy efficiency in running artificial neural-network architectures for the profit of generative applications. This encourages employing such hardware systems as platforms for simulations of quantum systems. Here we report on the realization of a prototype using the latest spike-based BrainScaleS hardware allowing us to represent few-qubit maximally entangled quantum states with high fidelities. Bell correlations of pure and mixed two-qubit states are well captured by the analog hardware, demonstrating an important building block for simulating quantum systems with spiking neuromorphic chips.
NEJun 23, 2020
Inference with Artificial Neural Networks on Analog Neuromorphic HardwareJohannes Weis, Philipp Spilger, Sebastian Billaudelle et al.
The neuromorphic BrainScaleS-2 ASIC comprises mixed-signal neurons and synapse circuits as well as two versatile digital microprocessors. Primarily designed to emulate spiking neural networks, the system can also operate in a vector-matrix multiplication and accumulation mode for artificial neural networks. Analog multiplication is carried out in the synapse circuits, while the results are accumulated on the neurons' membrane capacitors. Designed as an analog, in-memory computing device, it promises high energy efficiency. Fixed-pattern noise and trial-to-trial variations, however, require the implemented networks to cope with a certain level of perturbations. Further limitations are imposed by the digital resolution of the input values (5 bit), matrix weights (6 bit) and resulting neuron activations (8 bit). In this paper, we discuss BrainScaleS-2 as an analog inference accelerator and present calibration as well as optimization strategies, highlighting the advantages of training with hardware in the loop. Among other benchmarks, we classify the MNIST handwritten digits dataset using a two-dimensional convolution and two dense layers. We reach 98.0% test accuracy, closely matching the performance of the same network evaluated in software.
NEJun 23, 2020
hxtorch: PyTorch for BrainScaleS-2 -- Perceptrons on Analog Neuromorphic HardwarePhilipp Spilger, Eric Müller, Arne Emmel et al.
We present software facilitating the usage of the BrainScaleS-2 analog neuromorphic hardware system as an inference accelerator for artificial neural networks. The accelerator hardware is transparently integrated into the PyTorch machine learning framework using its extension interface. In particular, we provide accelerator support for vector-matrix multiplications and convolutions; corresponding software-based autograd functionality is provided for hardware-in-the-loop training. Automatic partitioning of neural networks onto one or multiple accelerator chips is supported. We analyze implementation runtime overhead during training as well as inference, provide measurements for existing setups and evaluate the results in terms of the accelerator hardware design limitations. As an application of the introduced framework, we present a model that classifies activities of daily living with smartphone sensor data.
NEJun 12, 2020
Surrogate gradients for analog neuromorphic computingBenjamin Cramer, Sebastian Billaudelle, Simeon Kanya et al.
To rapidly process temporal information at a low metabolic cost, biological neurons integrate inputs as an analog sum but communicate with spikes, binary events in time. Analog neuromorphic hardware uses the same principles to emulate spiking neural networks with exceptional energy-efficiency. However, instantiating high-performing spiking networks on such hardware remains a significant challenge due to device mismatch and the lack of efficient training algorithms. Here, we introduce a general in-the-loop learning framework based on surrogate gradients that resolves these issues. Using the BrainScaleS-2 neuromorphic system, we show that learning self-corrects for device mismatch resulting in competitive spiking network performance on both vision and speech benchmarks. Our networks display sparse spiking activity with, on average, far less than one spike per hidden neuron and input, perform inference at rates of up to 85 k frames/second, and consume less than 200 mW. In summary, our work sets several new benchmarks for low-energy spiking network processing on analog neuromorphic hardware and paves the way for future on-chip learning algorithms.
NEMar 30, 2020
The Operating System of the Neuromorphic BrainScaleS-1 SystemEric Müller, Sebastian Schmitt, Christian Mauch et al.
BrainScaleS-1 is a wafer-scale mixed-signal accelerated neuromorphic system targeted for research in the fields of computational neuroscience and beyond-von-Neumann computing. The BrainScaleS Operating System (BrainScaleS OS) is a software stack giving users the possibility to emulate networks described in the high-level network description language PyNN with minimal knowledge of the system. At the same time, expert usage is facilitated by allowing to hook into the system at any depth of the stack. We present operation and development methodologies implemented for the BrainScaleS-1 neuromorphic architecture and walk through the individual components of BrainScaleS OS constituting the software stack for BrainScaleS-1 platform operation.
NEMar 26, 2020
Accelerated Analog Neuromorphic ComputingJohannes Schemmel, Sebastian Billaudelle, Phillip Dauer et al.
This paper presents the concepts behind the BrainScales (BSS) accelerated analog neuromorphic computing architecture. It describes the second-generation BrainScales-2 (BSS-2) version and its most recent in-silico realization, the HICANN-X Application Specific Integrated Circuit (ASIC), as it has been developed as part of the neuromorphic computing activities within the European Human Brain Project (HBP). While the first generation is implemented in an 180nm process, the second generation uses 65nm technology. This allows the integration of a digital plasticity processing unit, a highly-parallel micro processor specially built for the computational needs of learning in an accelerated analog neuromorphic systems. The presented architecture is based upon a continuous-time, analog, physical model implementation of neurons and synapses, resembling an analog neuromorphic accelerator attached to build-in digital compute cores. While the analog part emulates the spike-based dynamics of the neural network in continuous-time, the latter simulates biological processes happening on a slower time-scale, like structural and parameter changes. Compared to biological time-scales, the emulation is highly accelerated, i.e. all time-constants are several orders of magnitude smaller than in biology. Programmable ion channel emulation and inter-compartmental conductances allow the modeling of nonlinear dendrites, back-propagating action-potentials as well as NMDA and Calcium plateau potentials. To extend the usability of the analog accelerator, it also supports vector-matrix multiplication. Thereby, BSS-2 supports inference of deep convolutional networks as well as local-learning with complex ensembles of spiking neurons within the same substrate.
ARMar 25, 2020
Verification and Design Methods for the BrainScaleS Neuromorphic Hardware SystemAndreas Grübl, Sebastian Billaudelle, Benjamin Cramer et al.
This paper presents verification and implementation methods that have been developed for the design of the BrainScaleS-2 65nm ASICs. The 2nd generation BrainScaleS chips are mixed-signal devices with tight coupling between full-custom analog neuromorphic circuits and two general purpose microprocessors (PPU) with SIMD extension for on-chip learning and plasticity. Simulation methods for automated analysis and pre-tapeout calibration of the highly parameterizable analog neuron and synapse circuits and for hardware-software co-development of the digital logic and software stack are presented. Accelerated operation of neuromorphic circuits and highly-parallel digital data buses between the full-custom neuromorphic part and the PPU require custom methodologies to close the digital signal timing at the interfaces. Novel extensions to the standard digital physical implementation design flow are highlighted. We present early results from the first full-size BrainScaleS-2 ASIC containing 512 neurons and 130K synapses, demonstrating the successful application of these methods. An application example illustrates the full functionality of the BrainScaleS-2 hybrid plasticity architecture.
NCDec 30, 2019
Versatile emulation of spiking neural networks on an accelerated neuromorphic substrateSebastian Billaudelle, Yannik Stradmann, Korbinian Schreiber et al.
We present first experimental results on the novel BrainScaleS-2 neuromorphic architecture based on an analog neuro-synaptic core and augmented by embedded microprocessors for complex plasticity and experiment control. The high acceleration factor of 1000 compared to biological dynamics enables the execution of computationally expensive tasks, by allowing the fast emulation of long-duration experiments or rapid iteration over many consecutive trials. The flexibility of our architecture is demonstrated in a suite of five distinct experiments, which emphasize different aspects of the BrainScaleS-2 system.
NCDec 27, 2019
Structural plasticity on an accelerated analog neuromorphic hardware systemSebastian Billaudelle, Benjamin Cramer, Mihai A. Petrovici et al.
In computational neuroscience, as well as in machine learning, neuromorphic devices promise an accelerated and scalable alternative to neural network simulations. Their neural connectivity and synaptic capacity depends on their specific design choices, but is always intrinsically limited. Here, we present a strategy to achieve structural plasticity that optimizes resource allocation under these constraints by constantly rewiring the pre- and gpostsynaptic partners while keeping the neuronal fan-in constant and the connectome sparse. In particular, we implemented this algorithm on the analog neuromorphic system BrainScaleS-2. It was executed on a custom embedded digital processor located on chip, accompanying the mixed-signal substrate of spiking neurons and synapse circuits. We evaluated our implementation in a simple supervised learning scenario, showing its ability to optimize the network topology with respect to the nature of its training data, as well as its overall computational efficiency.
NEDec 24, 2019
Fast and energy-efficient neuromorphic deep learning with first-spike timesJulian Göltz, Laura Kriener, Andreas Baumbach et al.
For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems are optimized for short time-to-solution and low energy-to-solution characteristics. At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. With time-to-first-spike coding both of these goals are inherently emerging features of learning. Here, we describe a rigorous derivation of a learning rule for such first-spike times in networks of leaky integrate-and-fire neurons, relying solely on input and output spike times, and show how this mechanism can implement error backpropagation in hierarchical spiking networks. Furthermore, we emulate our framework on the BrainScaleS-2 neuromorphic system and demonstrate its capability of harnessing the system's speed and energy characteristics. Finally, we examine how our approach generalizes to other neuromorphic platforms by studying how its performance is affected by typical distortive effects induced by neuromorphic substrates.
NENov 8, 2018
Demonstrating Advantages of Neuromorphic Computation: A Pilot StudyTimo Wunderlich, Akos F. Kungl, Eric Müller et al.
Neuromorphic devices represent an attempt to mimic aspects of the brain's architecture and dynamics with the aim of replicating its hallmark functional capabilities in terms of computational power, robust learning and energy efficiency. We employ a single-chip prototype of the BrainScaleS 2 neuromorphic system to implement a proof-of-concept demonstration of reward-modulated spike-timing-dependent plasticity in a spiking network that learns to play the Pong video game by smooth pursuit. This system combines an electronic mixed-signal substrate for emulating neuron and synapse dynamics with an embedded digital processor for on-chip learning, which in this work also serves to simulate the virtual environment and learning agent. The analog emulation of neuronal membrane dynamics enables a 1000-fold acceleration with respect to biological real-time, with the entire chip operating on a power budget of 57mW. Compared to an equivalent simulation using state-of-the-art software, the on-chip emulation is at least one order of magnitude faster and three orders of magnitude more energy-efficient. We demonstrate how on-chip learning can mitigate the effects of fixed-pattern noise, which is unavoidable in analog substrates, while making use of temporal variability for action exploration. Learning compensates imperfections of the physical substrate, as manifested in neuronal parameter variability, by adapting synaptic weights to match respective excitability of individual neurons.
NCMay 8, 2015
Porting HTM Models to the Heidelberg Neuromorphic Computing PlatformSebastian Billaudelle, Subutai Ahmad
Hierarchical Temporal Memory (HTM) is a computational theory of machine intelligence based on a detailed study of the neocortex. The Heidelberg Neuromorphic Computing Platform, developed as part of the Human Brain Project (HBP), is a mixed-signal (analog and digital) large-scale platform for modeling networks of spiking neurons. In this paper we present the first effort in porting HTM networks to this platform. We describe a framework for simulating key HTM operations using spiking network models. We then describe specific spatial pooling and temporal memory implementations, as well as simulations demonstrating that the fundamental properties are maintained. We discuss issues in implementing the full set of plasticity rules using Spike-Timing Dependent Plasticity (STDP), and rough place and route calculations. Although further work is required, our initial studies indicate that it should be possible to run large-scale HTM networks (including plasticity rules) efficiently on the Heidelberg platform. More generally the exercise of porting high level HTM algorithms to biophysical neuron models promises to be a fruitful area of investigation for future studies.