AIApr 10, 2023
NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and SystemsJason Yik, Korneel Van den Berghe, Douwe den Blanken et al. · eth-zurich
Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively-designed effort from an open community of researchers across industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we outline tasks and guidelines for benchmarks across multiple application domains, and present initial performance baselines across neuromorphic and conventional approaches for both benchmark tracks. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community.
NESep 20, 2023
Limitations in odour recognition and generalisation in a neuromorphic olfactory circuitNik Dennler, André van Schaik, Michael Schmuker
Neuromorphic computing is one of the few current approaches that have the potential to significantly reduce power consumption in Machine Learning and Artificial Intelligence. Imam & Cleland presented an odour-learning algorithm that runs on a neuromorphic architecture and is inspired by circuits described in the mammalian olfactory bulb. They assess the algorithm's performance in "rapid online learning and identification" of gaseous odorants and odorless gases (short "gases") using a set of gas sensor recordings of different odour presentations and corrupting them by impulse noise. We replicated parts of the study and discovered limitations that affect some of the conclusions drawn. First, the dataset used suffers from sensor drift and a non-randomised measurement protocol, rendering it of limited use for odour identification benchmarks. Second, we found that the model is restricted in its ability to generalise over repeated presentations of the same gas. We demonstrate that the task the study refers to can be solved with a simple hash table approach, matching or exceeding the reported results in accuracy and runtime. Therefore, a validation of the model that goes beyond restoring a learned data sample remains to be shown, in particular its suitability to odour identification tasks.
LGOct 20, 2023
An Event based Prediction Suffix TreeEvie Andrew, Travis Monk, André van Schaik
This article introduces the Event based Prediction Suffix Tree (EPST), a biologically inspired, event-based prediction algorithm. The EPST learns a model online based on the statistics of an event based input and can make predictions over multiple overlapping patterns. The EPST uses a representation specific to event based data, defined as a portion of the power set of event subsequences within a short context window. It is explainable, and possesses many promising properties such as fault tolerance, resistance to event noise, as well as the capability for one-shot learning. The computational features of the EPST are examined in a synthetic data prediction task with additive event noise, event jitter, and dropout. The resulting algorithm outputs predicted projections for the near term future of the signal, which may be applied to tasks such as event based anomaly detection or pattern recognition.
ARApr 9, 2025
Neural Signal Compression using RAMAN tinyML Accelerator for BCI ApplicationsAdithya Krishna, Sohan Debnath, Madhuvanthi Srivatsav et al.
High-quality, multi-channel neural recording is indispensable for neuroscience research and clinical applications. Large-scale brain recordings often produce vast amounts of data that must be wirelessly transmitted for subsequent offline analysis and decoding, especially in brain-computer interfaces (BCIs) utilizing high-density intracortical recordings with hundreds or thousands of electrodes. However, transmitting raw neural data presents significant challenges due to limited communication bandwidth and resultant excessive heating. To address this challenge, we propose a neural signal compression scheme utilizing Convolutional Autoencoders (CAEs), which achieves a compression ratio of up to 150 for compressing local field potentials (LFPs). The CAE encoder section is implemented on RAMAN, an energy-efficient tinyML accelerator designed for edge computing. RAMAN leverages sparsity in activation and weights through zero skipping, gating, and weight compression techniques. Additionally, we employ hardware-software co-optimization by pruning the CAE encoder model parameters using a hardware-aware balanced stochastic pruning strategy, resolving workload imbalance issues and eliminating indexing overhead to reduce parameter storage requirements by up to 32.4%. Post layout simulation shows that the RAMAN encoder can be implemented in a TSMC 65-nm CMOS process, occupying a core area of 0.0187 mm2 per channel. Operating at a clock frequency of 2 MHz and a supply voltage of 1.2 V, the estimated power consumption is 15.1 uW per channel for the proposed DS-CAE1 model. For functional validation, the RAMAN encoder was also deployed on an Efinix Ti60 FPGA, utilizing 37.3k LUTs and 8.6k flip-flops. The compressed neural data from RAMAN is reconstructed offline with SNDR of 22.6 dB and 27.4 dB, along with R2 scores of 0.81 and 0.94, respectively, evaluated on two monkey neural recordings.
NEMay 31, 2023
Efficient Implementation of a Multi-Layer Gradient-Free Online-Trainable Spiking Neural Network on FPGAAli Mehrabi, Yeshwanth Bethi, André van Schaik et al.
This paper presents an efficient hardware implementation of the recently proposed Optimized Deep Event-driven Spiking Neural Network Architecture (ODESA). ODESA is the first network to have end-to-end multi-layer online local supervised training without using gradients and has the combined adaptation of weights and thresholds in an efficient hierarchical structure. This research shows that the network architecture and the online training of weights and thresholds can be implemented efficiently on a large scale in hardware. The implementation consists of a multi-layer Spiking Neural Network (SNN) and individual training modules for each layer that enable online self-learning without using back-propagation. By using simple local adaptive selection thresholds, a Winner-Takes-All (WTA) constraint on each layer, and a modified weight update rule that is more amenable to hardware, the trainer module allocates neuronal resources optimally at each layer without having to pass high-precision error measurements across layers. All elements in the system, including the training module, interact using event-based binary spikes. The hardware-optimized implementation is shown to preserve the performance of the original algorithm across multiple spatial-temporal classification problems with significantly reduced hardware requirements.
CVMay 13, 2021
Superevents: Towards Native Semantic Segmentation for Event-based CamerasWeng Fei Low, Ankit Sonthalia, Zhi Gao et al.
Most successful computer vision models transform low-level features, such as Gabor filter responses, into richer representations of intermediate or mid-level complexity for downstream visual tasks. These mid-level representations have not been explored for event cameras, although it is especially relevant to the visually sparse and often disjoint spatial information in the event stream. By making use of locally consistent intermediate representations, termed as superevents, numerous visual tasks ranging from semantic segmentation, visual tracking, depth estimation shall benefit. In essence, superevents are perceptually consistent local units that delineate parts of an object in a scene. Inspired by recent deep learning architectures, we present a novel method that employs lifetime augmentation for obtaining an event stream representation that is fed to a fully convolutional network to extract superevents. Our qualitative and quantitative experimental results on several sequences of a benchmark dataset highlights the significant potential for event-based downstream applications.
ROOct 22, 2020
Source localization using particle filtering on FPGA for robotic navigation with imprecise binary measurementAdithya Krishna, André van Schaik, Chetan Singh Thakur
Particle filtering is a recursive Bayesian estimation technique that has gained popularity recently for tracking and localization applications. It uses Monte Carlo simulation and has proven to be a very reliable technique to model non-Gaussian and non-linear elements of physical systems. Particle filters outperform various other traditional filters like Kalman filters in non-Gaussian and non-linear settings due to their non-analytical and non-parametric nature. However, a significant drawback of particle filters is their computational complexity, which inhibits their use in real-time applications with conventional CPU or DSP based implementation schemes. This paper proposes a modification to the existing particle filter algorithm and presents a highspeed and dedicated hardware architecture. The architecture incorporates pipelining and parallelization in the design to reduce execution time considerably. The design is validated for a source localization problem wherein we estimate the position of a source in real-time using the particle filter algorithm implemented on hardware. The validation setup relies on an Unmanned Ground Vehicle (UGV) with a photodiode housing on top to sense and localize a light source. We have prototyped the design using Artix-7 field-programmable gate array (FPGA), and resource utilization for the proposed system is presented. Further, we show the execution time and estimation accuracy of the high-speed architecture and observe a significant reduction in computational time. Our implementation of particle filters on FPGA is scalable and modular, with a low execution time of about 5.62 us for processing 1024 particles and can be deployed for real-time applications.
NEJul 18, 2019
Event-based Feature Extraction Using Adaptive Selection ThresholdsSaeed Afshar, Ying Xu, Jonathan Tapson et al.
Unsupervised feature extraction algorithms form one of the most important building blocks in machine learning systems. These algorithms are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, not designed for the purpose, such algorithms typically require significant simplification during implementation to meet hardware constraints, creating trade offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate useful intermediary signals which are valuable only in the context of neuromorphic hardware limitations. In this work a novel event-based feature extraction method is proposed that focuses on these issues. The algorithm operates via simple adaptive selection thresholds which allow a simpler implementation of network homeostasis than previous works by trading off a small amount of information loss in the form of missed events that fall outside the selection thresholds. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals indicating network weight convergence without the need to access network weights. A novel heuristic method for network size selection is proposed which makes use of noise events and their feature representations. The use of selection thresholds is shown to produce network activation patterns that predict classification accuracy allowing rapid evaluation and optimization of system parameters without the need to run back-end classifiers. The feature extraction method is tested on both the N-MNIST benchmarking dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested with the results quantifying the resultant performance gains at each processing stage.
NEMay 23, 2018
Large-Scale Neuromorphic Spiking Array Processors: A quest to mimic the brainChetan Singh Thakur, Jamal Molin, Gert Cauwenberghs et al.
Neuromorphic engineering (NE) encompasses a diverse range of approaches to information processing that are inspired by neurobiological systems, and this feature distinguishes neuromorphic systems from conventional computing systems. The brain has evolved over billions of years to solve difficult engineering problems by using efficient, parallel, low-power computation. The goal of NE is to design systems capable of brain-like computation. Numerous large-scale neuromorphic projects have emerged recently. This interdisciplinary field was listed among the top 10 technology breakthroughs of 2014 by the MIT Technology Review and among the top 10 emerging technologies of 2015 by the World Economic Forum. NE has two-way goals: one, a scientific goal to understand the computational properties of biological neural systems by using models implemented in integrated circuits (ICs); second, an engineering goal to exploit the known properties of biological systems to design and implement efficient devices for engineering applications. Building hardware neural emulators can be extremely useful for simulating large-scale neural models to explain how intelligent behavior arises in the brain. The principle advantages of neuromorphic emulators are that they are highly energy efficient, parallel and distributed, and require a small silicon area. Thus, compared to conventional CPUs, these neuromorphic emulators are beneficial in many engineering applications such as for the porting of deep learning algorithms for various recognitions tasks. In this review article, we describe some of the most significant neuromorphic spiking emulators, compare the different architectures and approaches used by them, illustrate their advantages and drawbacks, and highlight the capabilities that each can deliver to neural modelers.
CVFeb 17, 2017
EMNIST: an extension of MNIST to handwritten lettersGregory Cohen, Saeed Afshar, Jonathan Tapson et al.
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute a more challenging classification tasks involving letters and digits, and that shares the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.
NEMar 13, 2016
A Stochastic Approach to STDPRunchun Wang, Chetan Singh Thakur, Tara Julia Hamilton et al.
We present a digital implementation of the Spike Timing Dependent Plasticity (STDP) learning rule. The proposed digital implementation consists of an exponential decay generator array and a STDP adaptor array. On the arrival of a pre- and post-synaptic spike, the STDP adaptor will send a digital spike to the decay generator. The decay generator will then generate an exponential decay, which will be used by the STDP adaptor to perform the weight adaption. The exponential decay, which is computational expensive, is efficiently implemented by using a novel stochastic approach, which we analyse and characterise here. We use a time multiplexing approach to achieve 8192 (8k) virtual STDP adaptors and decay generators with only one physical implementation of each. We have validated our stochastic STDP approach with measurement results of a balanced excitation/inhibition experiment. Our stochastic approach is ideal for implementing the STDP learning rule in large-scale spiking neural networks running in real time.
NEMar 2, 2015
A neuromorphic hardware framework based on population codingChetan Singh Thakur, Tara Julia Hamilton, Runchun Wang et al.
In the biological nervous system, large neuronal populations work collaboratively to encode sensory stimuli. These neuronal populations are characterised by a diverse distribution of tuning curves, ensuring that the entire range of input stimuli is encoded. Based on these principles, we have designed a neuromorphic system called a Trainable Analogue Block (TAB), which encodes given input stimuli using a large population of neurons with a heterogeneous tuning curve profile. Heterogeneity of tuning curves is achieved using random device mismatches in VLSI (Very Large Scale Integration) process and by adding a systematic offset to each hidden neuron. Here, we present measurement results of a single test cell fabricated in a 65nm technology to verify the TAB framework. We have mimicked a large population of neurons by re-using measurement results from the test cell by varying offset. We thus demonstrate the learning capability of the system for various regression tasks. The TAB system may pave the way to improve the design of analogue circuits for commercial applications, by rendering circuits insensitive to random mismatch that arises due to the manufacturing process.
NEMar 2, 2015
FPGA Implementation of the CAR Model of the CochleaChetan Singh Thakur, Tara Julia Hamilton, Jonathan Tapson et al.
The front end of the human auditory system, the cochlea, converts sound signals from the outside world into neural impulses transmitted along the auditory pathway for further processing. The cochlea senses and separates sound in a nonlinear active fashion, exhibiting remarkable sensitivity and frequency discrimination. Although several electronic models of the cochlea have been proposed and implemented, none of these are able to reproduce all the characteristics of the cochlea, including large dynamic range, large gain and sharp tuning at low sound levels, and low gain and broad tuning at intense sound levels. Here, we implement the Cascade of Asymmetric Resonators (CAR) model of the cochlea on an FPGA. CAR represents the basilar membrane filter in the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. CAR-FAC is a neuromorphic model of hearing based on a pole-zero filter cascade model of auditory filtering. It uses simple nonlinear extensions of conventional digital filter stages that are well suited to FPGA implementations, so that we are able to implement up to 1224 cochlear sections on Virtex-6 FPGA to process sound data in real time. The FPGA implementation of the electronic cochlea described here may be used as a front-end sound analyser for various machine-hearing applications.
NEDec 29, 2014
Fast, simple and accurate handwritten digit classification by training shallow neural network classifiers with the 'extreme learning machine' algorithmMark D. McDonnell, Migel D. Tissera, Tony Vladusich et al.
Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (~10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random `receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.
NEJun 12, 2014
Learning ELM network weights using linear discriminant analysisPhilip de Chazal, Jonathan Tapson, André van Schaik
We present an alternative to the pseudo-inverse method for determining the hidden to output weight values for Extreme Learning Machines performing classification tasks. The method is based on linear discriminant analysis and provides Bayes optimal single point estimates for the weight values.
NEJun 11, 2014
Explicit Computation of Input Weights in Extreme Learning MachinesJonathan Tapson, Philip de Chazal, André van Schaik
We present a closed form expression for initializing the input weights in a multi-layer perceptron, which can be used as the first step in synthesis of an Extreme Learning Ma-chine. The expression is based on the standard function for a separating hyperplane as computed in multilayer perceptrons and linear Support Vector Machines; that is, as a linear combination of input data samples. In the absence of supervised training for the input weights, random linear combinations of training data samples are used to project the input data to a higher dimensional hidden layer. The hidden layer weights are solved in the standard ELM fashion by computing the pseudoinverse of the hidden layer outputs and multiplying by the desired output values. All weights for this method can be computed in a single pass, and the resulting networks are more accurate and more consistent on some standard problems than regular ELM networks of the same size.
NEMay 30, 2014
ELM Solutions for Event-Based SystemsJonathan Tapson, André van Schaik
Whilst most engineered systems use signals that are continuous in time, there is a domain of systems in which signals consist of events. Events, like Dirac delta functions, have no meaningful time duration. Many important real-world systems are intrinsically event-based, including the mammalian brain, in which the primary packets of data are spike events, or action potentials. In this domain, signal processing requires responses to spatio-temporal patterns of events. We show that some straightforward modifications to the standard ELM topology produce networks that are able to perform spatio-temporal event processing online with a high degree of accuracy. The modifications involve the re-definition of hidden layer units as synaptic kernels, in which the input delta functions are transformed into continuous-valued signals using a variety of impulse-response functions. This permits the use of linear solution methods in the output layer, which can produce events as output, if modeled as a classifier; the output classes are 'event' or 'no event'. We illustrate the method in application to a spike-processing problem.
NEMay 30, 2014
Online and Adaptive Pseudoinverse Solutions for ELM WeightsAndré van Schaik, Jonathan Tapson
The ELM method has become widely used for classification and regressions problems as a result of its accuracy, simplicity and ease of use. The solution of the hidden layer weights by means of a matrix pseudoinverse operation is a significant contributor to the utility of the method; however, the conventional calculation of the pseudoinverse by means of a singular value decomposition (SVD) is not always practical for large data sets or for online updates to the solution. In this paper we discuss incremental methods for solving the pseudoinverse which are suitable for ELM. We show that careful choice of methods allows us to optimize for accuracy, ease of computation, or adaptability of the solution.