Filippo Moro

NE
h-index24
5papers
34citations
Novelty50%
AI Score32

5 Papers

NEJun 21, 2023
Synaptic metaplasticity with multi-level memristive devices

Simone D'Agostino, Filippo Moro, Tifenn Hirtzlin et al.

Deep learning has made remarkable progress in various tasks, surpassing human performance in some cases. However, one drawback of neural networks is catastrophic forgetting, where a network trained on one task forgets the solution when learning a new one. To address this issue, recent works have proposed solutions based on Binarized Neural Networks (BNNs) incorporating metaplasticity. In this work, we extend this solution to quantized neural networks (QNNs) and present a memristor-based hardware solution for implementing metaplasticity during both inference and training. We propose a hardware architecture that integrates quantized weights in memristor devices programmed in an analog multi-level fashion with a digital processing unit for high-precision metaplastic storage. We validated our approach using a combined software framework and memristor based crossbar array for in-memory computing fabricated in 130 nm CMOS technology. Our experimental results show that a two-layer perceptron achieves 97% and 86% accuracy on consecutive training of MNIST and Fashion-MNIST, equal to software baseline. This result demonstrates immunity to catastrophic forgetting and the resilience to analog device imperfections of the proposed solution. Moreover, our architecture is compatible with the memristor limited endurance and has a 15x reduction in memory

NEJul 26, 2024
The Role of Temporal Hierarchy in Spiking Neural Networks

Filippo Moro, Pau Vilimelis Aceituno, Laura Kriener et al.

Spiking Neural Networks (SNNs) have the potential for rich spatio-temporal signal processing thanks to exploiting both spatial and temporal parameters. The temporal dynamics such as time constants of the synapses and neurons and delays have been recently shown to have computational benefits that help reduce the overall number of parameters required in the network and increase the accuracy of the SNNs in solving temporal tasks. Optimizing such temporal parameters, for example, through gradient descent, gives rise to a temporal architecture for different problems. As has been shown in machine learning, to reduce the cost of optimization, architectural biases can be applied, in this case in the temporal domain. Such inductive biases in temporal parameters have been found in neuroscience studies, highlighting a hierarchy of temporal structure and input representation in different layers of the cortex. Motivated by this, we propose to impose a hierarchy of temporal representation in the hidden layers of SNNs, highlighting that such an inductive bias improves their performance. We demonstrate the positive effects of temporal hierarchy in the time constants of feed-forward SNNs applied to temporal tasks (Multi-Time-Scale XOR and Keyword Spotting, with a benefit of up to 4.1% in classification accuracy). Moreover, we show that such architectural biases, i.e. hierarchy of time constants, naturally emerge when optimizing the time constants through gradient descent, initialized as homogeneous values. We further pursue this proposal in temporal convolutional SNNs, by introducing the hierarchical bias in the size and dilation of temporal kernels, giving rise to competitive results in popular temporal spike-based datasets.

LGJun 14, 2025
Quantizing Small-Scale State-Space Models for Edge AI

Leo Zhao, Tristan Torchet, Melika Payvand et al.

State-space models (SSMs) have recently gained attention in deep learning for their ability to efficiently model long-range dependencies, making them promising candidates for edge-AI applications. In this paper, we analyze the effects of quantization on small-scale SSMs with a focus on reducing memory and computational costs while maintaining task performance. Using the S4D architecture, we first investigate post-training quantization (PTQ) and show that the state matrix A and internal state x are particularly sensitive to quantization. Furthermore, we analyze the impact of different quantization techniques applied to the parameters and activations in the S4D architecture. To address the observed performance drop after Post-training Quantization (PTQ), we apply Quantization-aware Training (QAT), significantly improving performance from 40% (PTQ) to 96% on the sequential MNIST benchmark at 8-bit precision. We further demonstrate the potential of QAT in enabling sub-8-bit precisions and evaluate different parameterization schemes for QAT stability. Additionally, we propose a heterogeneous quantization strategy that assigns different precision levels to model components, reducing the overall memory footprint by a factor of 6x without sacrificing performance. Our results provide actionable insights for deploying quantized SSMs in resource-constrained environments.

ARMay 13, 2025
MINIMALIST: switched-capacitor circuits for efficient in-memory computation of gated recurrent units

Sebastian Billaudelle, Laura Kriener, Filippo Moro et al.

Recurrent neural networks (RNNs) have been a long-standing candidate for processing of temporal sequence data, especially in memory-constrained systems that one may find in embedded edge computing environments. Recent advances in training paradigms have now inspired new generations of efficient RNNs. We introduce a streamlined and hardware-compatible architecture based on minimal gated recurrent units (GRUs), and an accompanying efficient mixed-signal hardware implementation of the model. The proposed design leverages switched-capacitor circuits not only for in-memory computation (IMC), but also for the gated state updates. The mixed-signal cores rely solely on commodity circuits consisting of metal capacitors, transmission gates, and a clocked comparator, thus greatly facilitating scaling and transfer to other technology nodes. We benchmark the performance of our architecture on time series data, introducing all constraints required for a direct mapping to the hardware system. The direct compatibility is verified in mixed-signal simulations, reproducing data recorded from the software-only network model.

NEFeb 10, 2022
Hardware calibrated learning to compensate heterogeneity in analog RRAM-based Spiking Neural Networks

Filippo Moro, E. Esmanhotto, T. Hirtzlin et al.

Spiking Neural Networks (SNNs) can unleash the full power of analog Resistive Random Access Memories (RRAMs) based circuits for low power signal processing. Their inherent computational sparsity naturally results in energy efficiency benefits. The main challenge implementing robust SNNs is the intrinsic variability (heterogeneity) of both analog CMOS circuits and RRAM technology. In this work, we assessed the performance and variability of RRAM-based neuromorphic circuits that were designed and fabricated using a 130\,nm technology node. Based on these results, we propose a Neuromorphic Hardware Calibrated (NHC) SNN, where the learning circuits are calibrated on the measured data. We show that by taking into account the measured heterogeneity characteristics in the off-chip learning phase, the NHC SNN self-corrects its hardware non-idealities and learns to solve benchmark tasks with high accuracy. This work demonstrates how to cope with the heterogeneity of neurons and synapses for increasing classification accuracy in temporal tasks.