Andreas Grübl

12papers

464citations

Novelty34%

AI Score22

Ranked #184,341 of 201,326 authors (top 92%)#868 in NE (top 76%)

12 Papers

ARFeb 24, 2022

Demonstrating BrainScaleS-2 Inter-Chip Pulse-Communication using EXTOLL

Tobias Thommes, Sven Bordukat, Andreas Grübl et al.

The BrainScaleS-2 (BSS-2) Neuromorphic Computing System currently consists of multiple single-chip setups, which are connected to a compute cluster via Gigabit-Ethernet network technology. This is convenient for small experiments, where the neural networks fit into a single chip. When modeling networks of larger size, neurons have to be connected across chip boundaries. We implement these connections for BSS-2 using the EXTOLL networking technology. This provides high bandwidths and low latencies, as well as high message rates. Here, we describe the targeted pulse-routing implementation and required extensions to the BSS-2 software stack. We as well demonstrate feed-forward pulse-routing on BSS-2 using a scaled-down version without temporal merging.

ARNov 30, 2021

BrainScaleS Large Scale Spike Communication using Extoll

Tobias Thommes, Niels Buwen, Andreas Grübl et al.

The BrainScaleS Neuromorphic Computing System is currently connected to a compute cluster via Gigabit-Ethernet network technology. This is convenient for the currently used experiment mode, where neuronal networks cover at most one wafer module. When modelling networks of larger size, as for example a full sized cortical microcircuit model, one has to think about connecting neurons across wafer modules to larger networks. This can be done, using the Extoll networking technology, which provides high bandwidth and low latencies, as well as a low overhead packet protocol format.

NEJun 23, 2020

Inference with Artificial Neural Networks on Analog Neuromorphic Hardware

Johannes Weis, Philipp Spilger, Sebastian Billaudelle et al.

The neuromorphic BrainScaleS-2 ASIC comprises mixed-signal neurons and synapse circuits as well as two versatile digital microprocessors. Primarily designed to emulate spiking neural networks, the system can also operate in a vector-matrix multiplication and accumulation mode for artificial neural networks. Analog multiplication is carried out in the synapse circuits, while the results are accumulated on the neurons' membrane capacitors. Designed as an analog, in-memory computing device, it promises high energy efficiency. Fixed-pattern noise and trial-to-trial variations, however, require the implemented networks to cope with a certain level of perturbations. Further limitations are imposed by the digital resolution of the input values (5 bit), matrix weights (6 bit) and resulting neuron activations (8 bit). In this paper, we discuss BrainScaleS-2 as an analog inference accelerator and present calibration as well as optimization strategies, highlighting the advantages of training with hardware in the loop. Among other benchmarks, we classify the MNIST handwritten digits dataset using a two-dimensional convolution and two dense layers. We reach 98.0% test accuracy, closely matching the performance of the same network evaluated in software.

NEJun 12, 2020

Surrogate gradients for analog neuromorphic computing

Benjamin Cramer, Sebastian Billaudelle, Simeon Kanya et al.

To rapidly process temporal information at a low metabolic cost, biological neurons integrate inputs as an analog sum but communicate with spikes, binary events in time. Analog neuromorphic hardware uses the same principles to emulate spiking neural networks with exceptional energy-efficiency. However, instantiating high-performing spiking networks on such hardware remains a significant challenge due to device mismatch and the lack of efficient training algorithms. Here, we introduce a general in-the-loop learning framework based on surrogate gradients that resolves these issues. Using the BrainScaleS-2 neuromorphic system, we show that learning self-corrects for device mismatch resulting in competitive spiking network performance on both vision and speech benchmarks. Our networks display sparse spiking activity with, on average, far less than one spike per hidden neuron and input, perform inference at rates of up to 85 k frames/second, and consume less than 200 mW. In summary, our work sets several new benchmarks for low-energy spiking network processing on analog neuromorphic hardware and paves the way for future on-chip learning algorithms.

NEMar 30, 2020

The Operating System of the Neuromorphic BrainScaleS-1 System

Eric Müller, Sebastian Schmitt, Christian Mauch et al.

BrainScaleS-1 is a wafer-scale mixed-signal accelerated neuromorphic system targeted for research in the fields of computational neuroscience and beyond-von-Neumann computing. The BrainScaleS Operating System (BrainScaleS OS) is a software stack giving users the possibility to emulate networks described in the high-level network description language PyNN with minimal knowledge of the system. At the same time, expert usage is facilitated by allowing to hook into the system at any depth of the stack. We present operation and development methodologies implemented for the BrainScaleS-1 neuromorphic architecture and walk through the individual components of BrainScaleS OS constituting the software stack for BrainScaleS-1 platform operation.

ARMar 25, 2020

Verification and Design Methods for the BrainScaleS Neuromorphic Hardware System

Andreas Grübl, Sebastian Billaudelle, Benjamin Cramer et al.

This paper presents verification and implementation methods that have been developed for the design of the BrainScaleS-2 65nm ASICs. The 2nd generation BrainScaleS chips are mixed-signal devices with tight coupling between full-custom analog neuromorphic circuits and two general purpose microprocessors (PPU) with SIMD extension for on-chip learning and plasticity. Simulation methods for automated analysis and pre-tapeout calibration of the highly parameterizable analog neuron and synapse circuits and for hardware-software co-development of the digital logic and software stack are presented. Accelerated operation of neuromorphic circuits and highly-parallel digital data buses between the full-custom neuromorphic part and the PPU require custom methodologies to close the digital signal timing at the interfaces. Novel extensions to the standard digital physical implementation design flow are highlighted. We present early results from the first full-size BrainScaleS-2 ASIC containing 512 neurons and 130K synapses, demonstrating the successful application of these methods. An application example illustrates the full functionality of the BrainScaleS-2 hybrid plasticity architecture.

NCDec 30, 2019

Versatile emulation of spiking neural networks on an accelerated neuromorphic substrate

Sebastian Billaudelle, Yannik Stradmann, Korbinian Schreiber et al.

We present first experimental results on the novel BrainScaleS-2 neuromorphic architecture based on an analog neuro-synaptic core and augmented by embedded microprocessors for complex plasticity and experiment control. The high acceleration factor of 1000 compared to biological dynamics enables the execution of computationally expensive tasks, by allowing the fast emulation of long-duration experiments or rapid iteration over many consecutive trials. The flexibility of our architecture is demonstrated in a suite of five distinct experiments, which emphasize different aspects of the BrainScaleS-2 system.

NENov 8, 2018

Demonstrating Advantages of Neuromorphic Computation: A Pilot Study

Timo Wunderlich, Akos F. Kungl, Eric Müller et al.

Neuromorphic devices represent an attempt to mimic aspects of the brain's architecture and dynamics with the aim of replicating its hallmark functional capabilities in terms of computational power, robust learning and energy efficiency. We employ a single-chip prototype of the BrainScaleS 2 neuromorphic system to implement a proof-of-concept demonstration of reward-modulated spike-timing-dependent plasticity in a spiking network that learns to play the Pong video game by smooth pursuit. This system combines an electronic mixed-signal substrate for emulating neuron and synapse dynamics with an embedded digital processor for on-chip learning, which in this work also serves to simulate the virtual environment and learning agent. The analog emulation of neuronal membrane dynamics enables a 1000-fold acceleration with respect to biological real-time, with the entire chip operating on a power budget of 57mW. Compared to an equivalent simulation using state-of-the-art software, the on-chip emulation is at least one order of magnitude faster and three orders of magnitude more energy-efficient. We demonstrate how on-chip learning can mitigate the effects of fixed-pattern noise, which is unavoidable in analog substrates, while making use of temporal variability for action exploration. Learning compensates imperfections of the physical substrate, as manifested in neuronal parameter variability, by adapting synaptic weights to match respective excitability of individual neurons.

NEJul 6, 2018

Accelerated physical emulation of Bayesian inference in spiking neural networks

Akos F. Kungl, Sebastian Schmitt, Johann Klähn et al.

The massively parallel nature of biological information processing plays an important role for its superiority to human-engineered computing devices. In particular, it may hold the key to overcoming the von Neumann bottleneck that limits contemporary computer architectures. Physical-model neuromorphic devices seek to replicate not only this inherent parallelism, but also aspects of its microscopic dynamics in analog circuits emulating neurons and synapses. However, these machines require network models that are not only adept at solving particular tasks, but that can also cope with the inherent imperfections of analog substrates. We present a spiking network model that performs Bayesian inference through sampling on the BrainScaleS neuromorphic platform, where we use it for generative and discriminative computations on visual data. By illustrating its functionality on this platform, we implicitly demonstrate its robustness to various substrate-specific distortive effects, as well as its accelerated capability for computation. These results showcase the advantages of brain-inspired physical computation and provide important building blocks for large-scale neuromorphic applications.

ETJan 15, 2018

Full Wafer Redistribution and Wafer Embedding as Key Technologies for a Multi-Scale Neuromorphic Hardware Cluster

Kai Zoschke, Maurice Güttler, Lars Böttcher et al.

Together with the Kirchhoff-Institute for Physics(KIP) the Fraunhofer IZM has developed a full wafer redistribution and embedding technology as base for a large-scale neuromorphic hardware system. The paper will give an overview of the neuromorphic computing platform at the KIP and the associated hardware requirements which drove the described technological developments. In the first phase of the project standard redistribution technologies from wafer level packaging were adapted to enable a high density reticle-to-reticle routing on 200mm CMOS wafers. Neighboring reticles were interconnected across the scribe lines with an 8μm pitch routing based on semi-additive copper metallization. Passivation by photo sensitive benzocyclobutene was used to enable a second intra-reticle routing layer. Final IO pads with flash gold were generated on top of each reticle. With that concept neuromorphic systems based on full wafers could be assembled and tested. The fabricated high density inter-reticle routing revealed a very high yield of larger than 99.9%. In order to allow an upscaling of the system size to a large number of wafers with feasible effort a full wafer embedding concept for printed circuit boards was developed and proven in the second phase of the project. The wafers were thinned to 250μm and laminated with additional prepreg layers and copper foils into a core material. After lamination of the PCB panel the reticle IOs of the embedded wafer were accessed by micro via drilling, copper electroplating, lithography and subtractive etching of the PCB wiring structure. The created wiring with 50um line width enabled an access of the reticle IOs on the embedded wafer as well as a board level routing. The panels with the embedded wafers were subsequently stressed with up to 1000 thermal cycles between 0C and 100C and have shown no severe failure formation over the cycle time.

NCMar 17, 2017

Pattern representation and recognition with accelerated analog neuromorphic systems

Mihai A. Petrovici, Sebastian Schmitt, Johann Klähn et al.

Despite being originally inspired by the central nervous system, artificial neural networks have diverged from their biological archetypes as they have been remodeled to fit particular tasks. In this paper, we review several possibilites to reverse map these architectures to biologically more realistic spiking networks with the aim of emulating them on fast, low-power neuromorphic hardware. Since many of these devices employ analog components, which cannot be perfectly controlled, finding ways to compensate for the resulting effects represents a key challenge. Here, we discuss three different strategies to address this problem: the addition of auxiliary network components for stabilizing activity, the utilization of inherently robust architectures and a training method for hardware-emulated networks that functions without perfect knowledge of the system's dynamics and parameters. For all three scenarios, we corroborate our theoretical considerations with experimental results on accelerated analog neuromorphic platforms.

NCMar 12, 2017

Robustness from structure: Inference with hierarchical spiking networks on analog neuromorphic hardware

Mihai A. Petrovici, Anna Schroeder, Oliver Breitwieser et al.

How spiking networks are able to perform probabilistic inference is an intriguing question, not only for understanding information processing in the brain, but also for transferring these computational principles to neuromorphic silicon circuits. A number of computationally powerful spiking network models have been proposed, but most of them have only been tested, under ideal conditions, in software simulations. Any implementation in an analog, physical system, be it in vivo or in silico, will generally lead to distorted dynamics due to the physical properties of the underlying substrate. In this paper, we discuss several such distortive effects that are difficult or impossible to remove by classical calibration routines or parameter training. We then argue that hierarchical networks of leaky integrate-and-fire neurons can offer the required robustness for physical implementation and demonstrate this with both software simulations and emulation on an accelerated analog neuromorphic device.