Jonathan Tapson

NE
21papers
1,280citations
Novelty44%
AI Score25

21 Papers

NEAug 9, 2019
Biologically-inspired Salience Affected Artificial Neural Network (SANN)

Leendert A Remmelzwaal, George F R Ellis, Jonathan Tapson et al.

In this paper we introduce a novel Salience Affected Artificial Neural Network (SANN) that models the way neuromodulators such as dopamine and noradrenaline affect neural dynamics in the human brain by being distributed diffusely through neocortical regions, allowing both salience signals to modulate cognition immediately, and one time learning to take place through strengthening entire patterns of activation at one go. We present a model that is capable of one-time salience tagging in a neural network trained to classify objects, and returns a salience response during classification (inference). We explore the effects of salience on learning via its effect on the activation functions of each node, as well as on the strength of weights between nodes in the network. We demonstrate that salience tagging can improve classification confidence for both the individual image as well as the class of images it belongs to. We also show that the computation impact of producing a salience response is minimal. This research serves as a proof of concept, and could be the first step towards introducing salience tagging into Deep Learning Networks and robotics.

NEJul 18, 2019
Event-based Feature Extraction Using Adaptive Selection Thresholds

Saeed Afshar, Ying Xu, Jonathan Tapson et al.

Unsupervised feature extraction algorithms form one of the most important building blocks in machine learning systems. These algorithms are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, not designed for the purpose, such algorithms typically require significant simplification during implementation to meet hardware constraints, creating trade offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate useful intermediary signals which are valuable only in the context of neuromorphic hardware limitations. In this work a novel event-based feature extraction method is proposed that focuses on these issues. The algorithm operates via simple adaptive selection thresholds which allow a simpler implementation of network homeostasis than previous works by trading off a small amount of information loss in the form of missed events that fall outside the selection thresholds. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals indicating network weight convergence without the need to access network weights. A novel heuristic method for network size selection is proposed which makes use of noise events and their feature representations. The use of selection thresholds is shown to produce network activation patterns that predict classification accuracy allowing rapid evaluation and optimization of system parameters without the need to run back-end classifiers. The feature extraction method is tested on both the N-MNIST benchmarking dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested with the results quantifying the resultant performance gains at each processing stage.

CVFeb 17, 2017
EMNIST: an extension of MNIST to handwritten letters

Gregory Cohen, Saeed Afshar, Jonathan Tapson et al.

The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute a more challenging classification tasks involving letters and digits, and that shares the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.

NEMar 14, 2016
Investigation of event-based memory surfaces for high-speed tracking, unsupervised feature extraction and object recognition

Saeed Afshar, Gregory Cohen, Tara Julia Hamilton et al.

In this paper we compare event-based decaying and time based-decaying memory surfaces for high-speed eventbased tracking, feature extraction, and object classification using an event-based camera. The high-speed recognition task involves detecting and classifying model airplanes that are dropped free-hand close to the camera lens so as to generate a challenging dataset exhibiting significant variance in target velocity. This variance motivated the investigation of event-based decaying memory surfaces in comparison to time-based decaying memory surfaces to capture the temporal aspect of the event-based data. These surfaces are then used to perform unsupervised feature extraction, tracking and recognition. In order to generate the memory surfaces, event binning, linearly decaying kernels, and exponentially decaying kernels were investigated with exponentially decaying kernels found to perform best. Event-based decaying memory surfaces were found to outperform time-based decaying memory surfaces in recognition especially when invariance to target velocity was made a requirement. A range of network and receptive field sizes were investigated. The system achieves 98.75% recognition accuracy within 156 milliseconds of an airplane entering the field of view, using only twenty-five event-based feature extracting neurons in series with a linear classifier. By comparing the linear classifier results to an ELM classifier, we find that a small number of event-based feature extractors can effectively project the complex spatio-temporal event patterns of the dataset to an almost linearly separable representation in feature space.

NEMar 13, 2016
A Stochastic Approach to STDP

Runchun Wang, Chetan Singh Thakur, Tara Julia Hamilton et al.

We present a digital implementation of the Spike Timing Dependent Plasticity (STDP) learning rule. The proposed digital implementation consists of an exponential decay generator array and a STDP adaptor array. On the arrival of a pre- and post-synaptic spike, the STDP adaptor will send a digital spike to the decay generator. The decay generator will then generate an exponential decay, which will be used by the STDP adaptor to perform the weight adaption. The exponential decay, which is computational expensive, is efficiently implemented by using a novel stochastic approach, which we analyse and characterise here. We use a time multiplexing approach to achieve 8192 (8k) virtual STDP adaptors and decay generators with only one physical implementation of each. We have validated our stochastic STDP approach with measurement results of a balanced excitation/inhibition experiment. Our stochastic approach is ideal for implementing the STDP learning rule in large-scale spiking neural networks running in real time.

NESep 3, 2015
A Reconfigurable Mixed-signal Implementation of a Neuromorphic ADC

Ying Xu, Chetan Singh Thakur, Tara Julia Hamilton et al.

We present a neuromorphic Analogue-to-Digital Converter (ADC), which uses integrate-and-fire (I&F) neurons as the encoders of the analogue signal, with modulated inhibitions to decohere the neuronal spikes trains. The architecture consists of an analogue chip and a control module. The analogue chip comprises two scan chains and a twodimensional integrate-and-fire neuronal array. Individual neurons are accessed via the chains one by one without any encoder decoder or arbiter. The control module is implemented on an FPGA (Field Programmable Gate Array), which sends scan enable signals to the scan chains and controls the inhibition for individual neurons. Since the control module is implemented on an FPGA, it can be easily reconfigured. Additionally, we propose a pulse width modulation methodology for the lateral inhibition, which makes use of different pulse widths indicating different strengths of inhibition for each individual neuron to decohere neuronal spikes. Software simulations in this paper tested the robustness of the proposed ADC architecture to fixed random noise. A circuit simulation using ten neurons shows the performance and the feasibility of the architecture.

NESep 3, 2015
A compact aVLSI conductance-based silicon neuron

Runchun Wang, Chetan Singh Thakur, Tara Julia Hamilton et al.

We present an analogue Very Large Scale Integration (aVLSI) implementation that uses first-order lowpass filters to implement a conductance-based silicon neuron for high-speed neuromorphic systems. The aVLSI neuron consists of a soma (cell body) and a single synapse, which is capable of linearly summing both the excitatory and inhibitory postsynaptic potentials (EPSP and IPSP) generated by the spikes arriving from different sources. Rather than biasing the silicon neuron with different parameters for different spiking patterns, as is typically done, we provide digital control signals, generated by an FPGA, to the silicon neuron to obtain different spiking behaviours. The proposed neuron is only ~26.5 um2 in the IBM 130nm process and thus can be integrated at very high density. Circuit simulations show that this neuron can emulate different spiking behaviours observed in biological neurons.

NEJul 21, 2015
A neuromorphic hardware architecture using the Neural Engineering Framework for pattern recognition

Runchun Wang, Chetan Singh Thakur, Tara Julia Hamilton et al.

We present a hardware architecture that uses the Neural Engineering Framework (NEF) to implement large-scale neural networks on Field Programmable Gate Arrays (FPGAs) for performing pattern recognition in real time. NEF is a framework that is capable of synthesising large-scale cognitive systems from subnetworks. We will first present the architecture of the proposed neural network implemented using fixed-point numbers and demonstrate a routine that computes the decoding weights by using the online pseudoinverse update method (OPIUM) in a parallel and distributed manner. The proposed system is efficiently implemented on a compact digital neural core. This neural core consists of 64 neurons that are instantiated by a single physical neuron using a time-multiplexing approach. As a proof of concept, we combined 128 identical neural cores together to build a handwritten digit recognition system using the MNIST database and achieved a recognition rate of 96.55%. The system is implemented on a state-of-the-art FPGA and can process 5.12 million digits per second. The architecture is not limited to handwriting recognition, but is generally applicable as an extremely fast pattern recognition processor for various kinds of patterns such as speech and images.

NEJul 10, 2015
A Trainable Neuromorphic Integrated Circuit that Exploits Device Mismatch

Chetan Singh Thakur, Runchun Wang, Tara Julia Hamilton et al.

Random device mismatch that arises as a result of scaling of the CMOS (complementary metal-oxide semi-conductor) technology into the deep submicron regime degrades the accuracy of analogue circuits. Methods to combat this increase the complexity of design. We have developed a novel neuromorphic system called a Trainable Analogue Block (TAB), which exploits device mismatch as a means for random projections of the input to a higher dimensional space. The TAB framework is inspired by the principles of neural population coding operating in the biological nervous system. Three neuronal layers, namely input, hidden, and output, constitute the TAB framework, with the number of hidden layer neurons far exceeding the input layer neurons. Here, we present measurement results of the first prototype TAB chip built using a 65nm process technology and show its learning capability for various regression tasks. Our TAB chip exploits inherent randomness and variability arising due to the fabrication process to perform various learning tasks. Additionally, we characterise each neuron and discuss the statistical variability of its tuning curve that arises due to random device mismatch, a desirable property for the learning capability of the TAB. We also discuss the effect of the number of hidden neurons and the resolution of output weights on the accuracy of the learning capability of the TAB.

NEMay 11, 2015
An Online Learning Algorithm for Neuromorphic Hardware Implementation

Chetan Singh Thakur, Runchun Wang, Saeed Afshar et al.

We propose a sign-based online learning (SOL) algorithm for a neuromorphic hardware framework called Trainable Analogue Block (TAB). The TAB framework utilises the principles of neural population coding, implying that it encodes the input stimulus using a large pool of nonlinear neurons. The SOL algorithm is a simple weight update rule that employs the sign of the hidden layer activation and the sign of the output error, which is the difference between the target output and the predicted output. The SOL algorithm is easily implementable in hardware, and can be used in any artificial neural network framework that learns weights by minimising a convex cost function. We show that the TAB framework can be trained for various regression tasks using the SOL algorithm.

NEMar 2, 2015
A neuromorphic hardware framework based on population coding

Chetan Singh Thakur, Tara Julia Hamilton, Runchun Wang et al.

In the biological nervous system, large neuronal populations work collaboratively to encode sensory stimuli. These neuronal populations are characterised by a diverse distribution of tuning curves, ensuring that the entire range of input stimuli is encoded. Based on these principles, we have designed a neuromorphic system called a Trainable Analogue Block (TAB), which encodes given input stimuli using a large population of neurons with a heterogeneous tuning curve profile. Heterogeneity of tuning curves is achieved using random device mismatches in VLSI (Very Large Scale Integration) process and by adding a systematic offset to each hidden neuron. Here, we present measurement results of a single test cell fabricated in a 65nm technology to verify the TAB framework. We have mimicked a large population of neurons by re-using measurement results from the test cell by varying offset. We thus demonstrate the learning capability of the system for various regression tasks. The TAB system may pave the way to improve the design of analogue circuits for commercial applications, by rendering circuits insensitive to random mismatch that arises due to the manufacturing process.

NEMar 2, 2015
FPGA Implementation of the CAR Model of the Cochlea

Chetan Singh Thakur, Tara Julia Hamilton, Jonathan Tapson et al.

The front end of the human auditory system, the cochlea, converts sound signals from the outside world into neural impulses transmitted along the auditory pathway for further processing. The cochlea senses and separates sound in a nonlinear active fashion, exhibiting remarkable sensitivity and frequency discrimination. Although several electronic models of the cochlea have been proposed and implemented, none of these are able to reproduce all the characteristics of the cochlea, including large dynamic range, large gain and sharp tuning at low sound levels, and low gain and broad tuning at intense sound levels. Here, we implement the Cascade of Asymmetric Resonators (CAR) model of the cochlea on an FPGA. CAR represents the basilar membrane filter in the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. CAR-FAC is a neuromorphic model of hearing based on a pole-zero filter cascade model of auditory filtering. It uses simple nonlinear extensions of conventional digital filter stages that are well suited to FPGA implementations, so that we are able to implement up to 1224 cochlear sections on Virtex-6 FPGA to process sound data in real time. The FPGA implementation of the electronic cochlea described here may be used as a front-end sound analyser for various machine-hearing applications.

NEDec 29, 2014
Fast, simple and accurate handwritten digit classification by training shallow neural network classifiers with the 'extreme learning machine' algorithm

Mark D. McDonnell, Migel D. Tissera, Tony Vladusich et al.

Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (~10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random `receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

NENov 11, 2014
Turn Down that Noise: Synaptic Encoding of Afferent SNR in a Single Spiking Neuron

Saeed Afshar, Libin George, Jonathan Tapson et al.

We have added a simplified neuromorphic model of Spike Time Dependent Plasticity (STDP) to the Synapto-dendritic Kernel Adapting Neuron (SKAN). The resulting neuron model is the first to show synaptic encoding of afferent signal to noise ratio in addition to the unsupervised learning of spatio temporal spike patterns. The neuron model is particularly suitable for implementation in digital neuromorphic hardware as it does not use any complex mathematical operations and uses a novel approach to achieve synaptic homeostasis. The neurons noise compensation properties are characterized and tested on noise corrupted zeros digits of the MNIST handwritten dataset. Results show the simultaneously learning common patterns in its input data while dynamically weighing individual afferent channels based on their signal to noise ratio. Despite its simplicity the interesting behaviors of the neuron model and the resulting computational power may offer insights into biological systems.

NEAug 6, 2014
Racing to Learn: Statistical Inference and Learning in a Single Spiking Neuron with Adaptive Kernels

Saeed Afshar, Libin George, Jonathan Tapson et al.

This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively hiding its learnt pattern from its neighbors. The robustness to noise, high speed and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA).

NEJun 12, 2014
Learning ELM network weights using linear discriminant analysis

Philip de Chazal, Jonathan Tapson, André van Schaik

We present an alternative to the pseudo-inverse method for determining the hidden to output weight values for Extreme Learning Machines performing classification tasks. The method is based on linear discriminant analysis and provides Bayes optimal single point estimates for the weight values.

NEJun 11, 2014
Explicit Computation of Input Weights in Extreme Learning Machines

Jonathan Tapson, Philip de Chazal, André van Schaik

We present a closed form expression for initializing the input weights in a multi-layer perceptron, which can be used as the first step in synthesis of an Extreme Learning Ma-chine. The expression is based on the standard function for a separating hyperplane as computed in multilayer perceptrons and linear Support Vector Machines; that is, as a linear combination of input data samples. In the absence of supervised training for the input weights, random linear combinations of training data samples are used to project the input data to a higher dimensional hidden layer. The hidden layer weights are solved in the standard ELM fashion by computing the pseudoinverse of the hidden layer outputs and multiplying by the desired output values. All weights for this method can be computed in a single pass, and the resulting networks are more accurate and more consistent on some standard problems than regular ELM networks of the same size.

NEMay 30, 2014
ELM Solutions for Event-Based Systems

Jonathan Tapson, André van Schaik

Whilst most engineered systems use signals that are continuous in time, there is a domain of systems in which signals consist of events. Events, like Dirac delta functions, have no meaningful time duration. Many important real-world systems are intrinsically event-based, including the mammalian brain, in which the primary packets of data are spike events, or action potentials. In this domain, signal processing requires responses to spatio-temporal patterns of events. We show that some straightforward modifications to the standard ELM topology produce networks that are able to perform spatio-temporal event processing online with a high degree of accuracy. The modifications involve the re-definition of hidden layer units as synaptic kernels, in which the input delta functions are transformed into continuous-valued signals using a variety of impulse-response functions. This permits the use of linear solution methods in the output layer, which can produce events as output, if modeled as a classifier; the output classes are 'event' or 'no event'. We illustrate the method in application to a spike-processing problem.

NEMay 30, 2014
Online and Adaptive Pseudoinverse Solutions for ELM Weights

André van Schaik, Jonathan Tapson

The ELM method has become widely used for classification and regressions problems as a result of its accuracy, simplicity and ease of use. The solution of the hidden layer weights by means of a matrix pseudoinverse operation is a significant contributor to the utility of the method; however, the conventional calculation of the pseudoinverse by means of a singular value decomposition (SVD) is not always practical for large data sets or for online updates to the solution. In this paper we discuss incremental methods for solving the pseudoinverse which are suitable for ELM. We show that careful choice of methods allows us to optimize for accuracy, ease of computation, or adaptability of the solution.

NEJun 13, 2013
The Ripple Pond: Enabling Spiking Networks to See

Saeed Afshar, Gregory Cohen, Runchun Wang et al.

In this paper we present the biologically inspired Ripple Pond Network (RPN), a simply connected spiking neural network that, operating together with recently proposed PolyChronous Networks (PCN), enables rapid, unsupervised, scale and rotation invariant object recognition using efficient spatio-temporal spike coding. The RPN has been developed as a hardware solution linking previously implemented neuromorphic vision and memory structures capable of delivering end-to-end high-speed, low-power and low-resolution recognition for mobile and autonomous applications where slow, highly sophisticated and power hungry signal processing solutions are ineffective. Key aspects in the proposed approach include utilising the spatial properties of physically embedded neural networks and propagating waves of activity therein for information processing, using dimensional collapse of imagery information into amenable temporal patterns and the use of asynchronous frames for information binding.

NEJul 13, 2012
Learning the Pseudoinverse Solution to Network Weights

Jonathan Tapson, Andre van Schaik

The last decade has seen the parallel emergence in computational neuroscience and machine learning of neural network structures which spread the input signal randomly to a higher dimensional space; perform a nonlinear activation; and then solve for a regression or classification output by means of a mathematical pseudoinverse operation. In the field of neuromorphic engineering, these methods are increasingly popular for synthesizing biologically plausible neural networks, but the "learning method" - computation of the pseudoinverse by singular value decomposition - is problematic both for biological plausibility and because it is not an online or an adaptive method. We present an online or incremental method of computing the pseudoinverse, which we argue is biologically plausible as a learning method, and which can be made adaptable for non-stationary data streams. The method is significantly more memory-efficient than the conventional computation of pseudoinverses by singular value decomposition.