Danilo Pau

LG
h-index19
9papers
642citations
Novelty39%
AI Score37

9 Papers

LGApr 25, 2023
Improving Robustness Against Adversarial Attacks with Deeply Quantized Neural Networks

Ferheen Ayaz, Idris Zakariyya, José Cano et al.

Reducing the memory footprint of Machine Learning (ML) models, particularly Deep Neural Networks (DNNs), is essential to enable their deployment into resource-constrained tiny devices. However, a disadvantage of DNN models is their vulnerability to adversarial attacks, as they can be fooled by adding slight perturbations to the inputs. Therefore, the challenge is how to create accurate, robust, and tiny DNN models deployable on resource-constrained embedded devices. This paper reports the results of devising a tiny DNN model, robust to adversarial black and white box attacks, trained with an automatic quantizationaware training framework, i.e. QKeras, with deep quantization loss accounted in the learning loop, thereby making the designed DNNs more accurate for deployment on tiny devices. We investigated how QKeras and an adversarial robustness technique, Jacobian Regularization (JR), can provide a co-optimization strategy by exploiting the DNN topology and the per layer JR approach to produce robust yet tiny deeply quantized DNN models. As a result, a new DNN model implementing this cooptimization strategy was conceived, developed and tested on three datasets containing both images and audio inputs, as well as compared its performance with existing benchmarks against various white-box and black-box attacks. Experimental results demonstrated that on average our proposed DNN model resulted in 8.3% and 79.5% higher accuracy than MLCommons/Tiny benchmarks in the presence of white-box and black-box attacks on the CIFAR-10 image dataset and a subset of the Google Speech Commands audio dataset respectively. It was also 6.5% more accurate for black-box attacks on the SVHN image dataset.

LGDec 17, 2025
OASI: Objective-Aware Surrogate Initialization for Multi-Objective Bayesian Optimization in TinyML Keyword Spotting

Soumen Garai, Danilo Pau, Suman Samui

Voice-triggered interfaces rely on keyword spotting (KWS) models that must operate continuously under strict memory, latency, and energy constraints on microcontroller-class hardware. Designing such models therefore requires not only high recognition accuracy but also predictable deployability within limited Flash and SRAM budgets. Bayesian optimization is known to handle accuracy-efficiency trade-offs effectively in multi-objective optimization; however, it is highly sensitive to initialization, particularly in the low-budget regimes of TinyML model optimization. We propose Objective-Aware Surrogate Initialization (OASI), which seeds surrogate optimization with Pareto-biased solutions generated via multi-objective simulated annealing. Unlike space-filling or heuristic warm-start methods, OASI initializes the surrogate conditioning process with a bias toward feasible accuracy-memory trade-offs, thus avoiding SRAM-violating configurations. OASI improves hypervolume and convergence robustness over Latin hypercube, Sobol, and random initializations under the same budget constraints on a TinyML KWS problem. Hardware-in-the-loop experiments on STM32 microcontrollers verify the existence of deployable and memory-feasible models without incurring extra optimization costs.

LGMar 12, 2025
Quantitative Analysis of Deeply Quantized Tiny Neural Networks Robust to Adversarial Attacks

Idris Zakariyya, Ferheen Ayaz, Mounia Kharbouche-Harrari et al.

Reducing the memory footprint of Machine Learning (ML) models, especially Deep Neural Networks (DNNs), is imperative to facilitate their deployment on resource-constrained edge devices. However, a notable drawback of DNN models lies in their susceptibility to adversarial attacks, wherein minor input perturbations can deceive them. A primary challenge revolves around the development of accurate, resilient, and compact DNN models suitable for deployment on resource-constrained edge devices. This paper presents the outcomes of a compact DNN model that exhibits resilience against both black-box and white-box adversarial attacks. This work has achieved this resilience through training with the QKeras quantization-aware training framework. The study explores the potential of QKeras and an adversarial robustness technique, Jacobian Regularization (JR), to co-optimize the DNN architecture through per-layer JR methodology. As a result, this paper has devised a DNN model employing this co-optimization strategy based on Stochastic Ternary Quantization (STQ). Its performance was compared against existing DNN models in the face of various white-box and black-box attacks. The experimental findings revealed that, the proposed DNN model had small footprint and on average, it exhibited better performance than Quanos and DS-CNN MLCommons/TinyML (MLC/T) benchmarks when challenged with white-box and black-box attacks, respectively, on the CIFAR-10 image and Google Speech Commands audio datasets.

LGFeb 1, 2025
Enhancing Field-Oriented Control of Electric Drives with Tiny Neural Network Optimized for Micro-controllers

Martin Joel Mouk Elele, Danilo Pau, Shixin Zhuang et al.

The deployment of neural networks on resource-constrained micro-controllers has gained momentum, driving many advancements in Tiny Neural Networks. This paper introduces a tiny feed-forward neural network, TinyFC, integrated into the Field-Oriented Control (FOC) of Permanent Magnet Synchronous Motors (PMSMs). Proportional-Integral (PI) controllers are widely used in FOC for their simplicity, although their limitations in handling nonlinear dynamics hinder precision. To address this issue, a lightweight 1,400 parameters TinyFC was devised to enhance the FOC performance while fitting into the computational and memory constraints of a micro-controller. Advanced optimization techniques, including pruning, hyperparameter tuning, and quantization to 8-bit integers, were applied to reduce the model's footprint while preserving the network effectiveness. Simulation results show the proposed approach significantly reduced overshoot by up to 87.5%, with the pruned model achieving complete overshoot elimination, highlighting the potential of tiny neural networks in real-time motor control applications.

LGJun 14, 2021
MLPerf Tiny Benchmark

Colby Banbury, Vijay Janapa Reddi, Peter Torelli et al.

Advancements in ultra-low-power tiny machine learning (TinyML) systems promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted and easily reproducible benchmark for these systems. To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems. The benchmark suite is the collaborative effort of more than 50 organizations from industry and academia and reflects the needs of the community. MLPerf Tiny measures the accuracy, latency, and energy of machine learning inference to properly evaluate the tradeoffs between systems. Additionally, MLPerf Tiny implements a modular design that enables benchmark submitters to show the benefits of their product, regardless of where it falls on the ML deployment stack, in a fair and reproducible manner. The suite features four benchmarks: keyword spotting, visual wake words, image classification, and anomaly detection.

LGFeb 27, 2021
Characterization of Neural Networks Automatically Mapped on Automotive-grade Microcontrollers

Giulia Crocioni, Giambattista Gruosso, Danilo Pau et al.

Nowadays, Neural Networks represent a major expectation for the realization of powerful Deep Learning algorithms, which can determine several physical systems' behaviors and operations. Computational resources required for model, training, and running are large, especially when related to the amount of data that Neural Networks typically need to generalize. The latest TinyML technologies allow integrating pre-trained models on embedded systems, allowing making computing at the edge faster, cheaper, and safer. Although these technologies originated in the consumer and industrial worlds, many sectors can greatly benefit from them, such as the automotive industry. In this paper, we present a framework for implementing Neural Network-based models on a family of automotive Microcontrollers, showing their efficiency in two case studies applied to vehicles: intrusion detection on the Controller Area Network bus and residual capacity estimation in Lithium-Ion batteries, widely used in Electric Vehicles.

PFMar 10, 2020
Benchmarking TinyML Systems: Challenges and Direction

Colby R. Banbury, Vijay Janapa Reddi, Max Lam et al.

Recent advancements in ultra-low-power machine learning (TinyML) hardware promises to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted benchmark for these systems. Benchmarking allows us to measure and thereby systematically compare, evaluate, and improve the performance of systems and is therefore fundamental to a field reaching maturity. In this position paper, we present the current landscape of TinyML and discuss the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads. Furthermore, we present our four benchmarks and discuss our selection methodology. Our viewpoints reflect the collective thoughts of the TinyMLPerf working group that is comprised of over 30 organizations.

CVSep 8, 2016
Reduced Memory Region Based Deep Convolutional Neural Network Detection

Denis Tome', Luca Bondi, Emanuele Plebani et al.

Accurate pedestrian detection has a primary role in automotive safety: for example, by issuing warnings to the driver or acting actively on car's brakes, it helps decreasing the probability of injuries and human fatalities. In order to achieve very high accuracy, recent pedestrian detectors have been based on Convolutional Neural Networks (CNN). Unfortunately, such approaches require vast amounts of computational power and memory, preventing efficient implementations on embedded systems. This work proposes a CNN-based detector, adapting a general-purpose convolutional network to the task at hand. By thoroughly analyzing and optimizing each step of the detection pipeline, we develop an architecture that outperforms methods based on traditional image features and achieves an accuracy close to the state-of-the-art while having low computational complexity. Furthermore, the model is compressed in order to fit the tight constrains of low power devices with a limited amount of embedded memory available. This paper makes two main contributions: (1) it proves that a region based deep neural network can be finely tuned to achieve adequate accuracy for pedestrian detection (2) it achieves a very low memory usage without reducing detection accuracy on the Caltech Pedestrian dataset.

DCMar 28, 2015
A Multi-signal Variant for the GPU-based Parallelization of Growing Self-Organizing Networks

Giacomo Parigi, Angelo Stramieri, Danilo Pau et al.

Among the many possible approaches for the parallelization of self-organizing networks, and in particular of growing self-organizing networks, perhaps the most common one is producing an optimized, parallel implementation of the standard sequential algorithms reported in the literature. In this paper we explore an alternative approach, based on a new algorithm variant specifically designed to match the features of the large-scale, fine-grained parallelism of GPUs, in which multiple input signals are processed at once. Comparative tests have been performed, using both parallel and sequential implementations of the new algorithm variant, in particular for a growing self-organizing network that reconstructs surfaces from point clouds. The experimental results show that this approach allows harnessing in a more effective way the intrinsic parallelism that the self-organizing networks algorithms seem intuitively to suggest, obtaining better performances even with networks of smaller size.