Johannes Partzsch

h-index21

7papers

125citations

Novelty37%

AI Score36

Ranked #98,092 of 194,257 authors (top 50%)#300 in NE (top 28%)

7 Papers

6.6LGApr 10, 2023

Deploying Machine Learning Models to Ahead-of-Time Runtime on Edge Using MicroTVM

Chen Liu, Matthias Jobst, Liyuan Guo et al.

In the past few years, more and more AI applications have been applied to edge devices. However, models trained by data scientists with machine learning frameworks, such as PyTorch or TensorFlow, can not be seamlessly executed on edge. In this paper, we develop an end-to-end code generator parsing a pre-trained model to C source libraries for the backend using MicroTVM, a machine learning compiler framework extension addressing inference on bare metal devices. An analysis shows that specific compute-intensive operators can be easily offloaded to the dedicated accelerator with a Universal Modular Accelerator (UMA) interface, while others are processed in the CPU cores. By using the automatically generated ahead-of-time C runtime, we conduct a hand gesture recognition experiment on an ARM Cortex M4F core.

6.9ETMar 25

Characterization of Off-wafer Pulse Communication in BrainScaleS Neuromorphic System

Bernhard Vogginger, Vasilis Thanasoulis, Johannes Partzsch et al.

Neuromorphic VLSI systems take inspiration from biology to enable efficient emulation of large-scale spiking neural networks and to explore new computational paradigms. To establish large neuromorphic systems, a sophisticated routing infrastructure is needed to communicate spikes between chips and to/from the host computer. For the BrainScaleS wafer-scale neuromorphic system considered in this work, especially the stimulation with input spikes and the recording of spikes is demanding, requiring high bandwidth and temporal resolution due to the accelerated emulation of neural dynamics 10.000 faster than biological real time. Here, we present a systematic characterization of the BrainScaleS off-wafer communication infrastructure implemented around Kintex7 FPGAs. The communication flow is characterized in terms of throughput, transmission delay, jitter and pulse loss. Further, we analyze the effect of the communication distortions (like pulse loss and jitter) on a neural benchmark model with highly varying spike activity. The presented methods and techniques for communication evaluation are general applicable and provide useful insights for the mapping of network models to the hardware such as the distribution of input spikes across communication channels.

13.0LGApr 9, 2025

Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation

Sirine Arfa, Bernhard Vogginger, Chen Liu et al.

Spiking Neural Networks (SNNs) are highly energy-efficient during inference, making them particularly suitable for deployment on neuromorphic hardware. Their ability to process event-driven inputs, such as data from dynamic vision sensors (DVS), further enhances their applicability to edge computing tasks. However, the resource constraints of edge hardware necessitate techniques like weight quantization, which reduce the memory footprint of SNNs while preserving accuracy. Despite its importance, existing quantization methods typically focus on synaptic weights quantization without taking account of other critical parameters, such as scaling neuron firing thresholds. To address this limitation, we present the first benchmark for the DVS gesture recognition task using SNNs optimized for the many-core neuromorphic chip SpiNNaker2. Our study evaluates two quantization pipelines for fixed-point computations. The first approach employs post training quantization (PTQ) with percentile-based threshold scaling, while the second uses quantization aware training (QAT) with adaptive threshold scaling. Both methods achieve accurate 8-bit on-chip inference, closely approximating 32-bit floating-point performance. Additionally, our baseline SNNs perform competitively against previously reported results without specialized techniques. These models are deployed on SpiNNaker2 using the neuromorphic intermediate representation (NIR). Ultimately, we achieve 94.13% classification accuracy on-chip, demonstrating the SpiNNaker2's potential for efficient, low-energy neuromorphic computing.

7.5NESep 18, 2020

Low-Power Low-Latency Keyword Spotting and Adaptive Control with a SpiNNaker 2 Prototype and Comparison with Loihi

Yexin Yan, Terrence C. Stewart, Xuan Choo et al.

We implemented two neural network based benchmark tasks on a prototype chip of the second-generation SpiNNaker (SpiNNaker 2) neuromorphic system: keyword spotting and adaptive robotic control. Keyword spotting is commonly used in smart speakers to listen for wake words, and adaptive control is used in robotic applications to adapt to unknown dynamics in an online fashion. We highlight the benefit of a multiply accumulate (MAC) array in the SpiNNaker 2 prototype which is ordinarily used in rate-based machine learning networks when employed in a neuromorphic, spiking context. In addition, the same benchmark tasks have been implemented on the Loihi neuromorphic chip, giving a side-by-side comparison regarding power consumption and computation time. While Loihi shows better efficiency when less complicated vector-matrix multiplication is involved, with the MAC array, the SpiNNaker 2 prototype shows better efficiency when high dimensional vector-matrix multiplication is involved.

9.9NEMar 30, 2020

The Operating System of the Neuromorphic BrainScaleS-1 System

Eric Müller, Sebastian Schmitt, Christian Mauch et al.

BrainScaleS-1 is a wafer-scale mixed-signal accelerated neuromorphic system targeted for research in the fields of computational neuroscience and beyond-von-Neumann computing. The BrainScaleS Operating System (BrainScaleS OS) is a software stack giving users the possibility to emulate networks described in the high-level network description language PyNN with minimal knowledge of the system. At the same time, expert usage is facilitated by allowing to hook into the system at any depth of the stack. We present operation and development methodologies implemented for the BrainScaleS-1 neuromorphic architecture and walk through the individual components of BrainScaleS OS constituting the software stack for BrainScaleS-1 platform operation.

8.1NEMar 21, 2019

Dynamic Power Management for Neuromorphic Many-Core Systems

Sebastian Hoeppner, Bernhard Vogginger, Yexin Yan et al.

This work presents a dynamic power management architecture for neuromorphic many core systems such as SpiNNaker. A fast dynamic voltage and frequency scaling (DVFS) technique is presented which allows the processing elements (PE) to change their supply voltage and clock frequency individually and autonomously within less than 100 ns. This is employed by the neuromorphic simulation software flow, which defines the performance level (PL) of the PE based on the actual workload within each simulation cycle. A test chip in 28 nm SLP CMOS technology has been implemented. It includes 4 PEs which can be scaled from 0.7 V to 1.0 V with frequencies from 125 MHz to 500 MHz at three distinct PLs. By measurement of three neuromorphic benchmarks it is shown that the total PE power consumption can be reduced by 75%, with 80% baseline power reduction and a 50% reduction of energy per neuron and synapse computation, all while maintaining temporary peak system performance to achieve biological real-time operation of the system. A numerical model of this power management model is derived which allows DVFS architecture exploration for neuromorphics. The proposed technique is to be used for the second generation SpiNNaker neuromorphic many core system.

8.6NCMar 17, 2017

Pattern representation and recognition with accelerated analog neuromorphic systems

Mihai A. Petrovici, Sebastian Schmitt, Johann Klähn et al.

Despite being originally inspired by the central nervous system, artificial neural networks have diverged from their biological archetypes as they have been remodeled to fit particular tasks. In this paper, we review several possibilites to reverse map these architectures to biologically more realistic spiking networks with the aim of emulating them on fast, low-power neuromorphic hardware. Since many of these devices employ analog components, which cannot be perfectly controlled, finding ways to compensate for the resulting effects represents a key challenge. Here, we discuss three different strategies to address this problem: the addition of auxiliary network components for stabilizing activity, the utilization of inherently robust architectures and a training method for hardware-emulated networks that functions without perfect knowledge of the system's dynamics and parameters. For all three scenarios, we corroborate our theoretical considerations with experimental results on accelerated analog neuromorphic platforms.