Emmanuel Baccelli

LG
h-index9
9papers
38citations
Novelty43%
AI Score52

9 Papers

LGJun 26, 2023Code
U-TOE: Universal TinyML On-board Evaluation Toolkit for Low-Power IoT

Zhaolan Huang, Koen Zandberg, Kaspar Schleiser et al.

Results from the TinyML community demonstrate that, it is possible to execute machine learning models directly on the terminals themselves, even if these are small microcontroller-based devices. However, to date, practitioners in the domain lack convenient all-in-one toolkits to help them evaluate the feasibility of executing arbitrary models on arbitrary low-power IoT hardware. To this effect, we present in this paper U-TOE, a universal toolkit we designed to facilitate the task of IoT designers and researchers, by combining functionalities from a low-power embedded OS, a generic model transpiler and compiler, an integrated performance measurement module, and an open-access remote IoT testbed. We provide an open source implementation of U-TOE and we demonstrate its use to experimentally evaluate the performance of various models, on a wide variety of low-power IoT boards, based on popular microcontroller architectures. U-TOE allows easily reproducible and customizable comparative evaluation experiments on a wide variety of IoT hardware all-at-once. The availability of a toolkit such as U-TOE is desirable to accelerate research combining Artificial Intelligence and IoT towards fully exploiting the potential of edge computing.

LGJul 31, 2024
TinyChirp: Bird Song Recognition Using TinyML Models on Low-power Wireless Acoustic Sensors

Zhaolan Huang, Adrien Tousnakhoff, Polina Kozyr et al.

Monitoring biodiversity at scale is challenging. Detecting and identifying species in fine grained taxonomies requires highly accurate machine learning (ML) methods. Training such models requires large high quality data sets. And deploying these models to low power devices requires novel compression techniques and model architectures. While species classification methods have profited from novel data sets and advances in ML methods, in particular neural networks, deploying these state of the art models to low power devices remains difficult. Here we present a comprehensive empirical comparison of various tinyML neural network architectures and compression techniques for species classification. We focus on the example of bird song detection, more concretely a data set curated for studying the corn bunting bird species. The data set is released along with all code and experiments of this study. In our experiments we compare predictive performance, memory and time complexity of classical spectrogram based methods and recent approaches operating on raw audio signal. Our results indicate that individual bird species can be robustly detected with relatively simple architectures that can be readily deployed to low power devices.

LGDec 10, 2025Code
Ariel-ML: Computing Parallelization with Embedded Rust for Neural Networks on Heterogeneous Multi-core Microcontrollers

Zhaolan Huang, Kaspar Schleiser, Gyungmin Myung et al.

Low-power microcontroller (MCU) hardware is currently evolving from single-core architectures to predominantly multi-core architectures. In parallel, new embedded software building blocks are more and more written in Rust, while C/C++ dominance fades in this domain. On the other hand, small artificial neural networks (ANN) of various kinds are increasingly deployed in edge AI use cases, thus deployed and executed directly on low-power MCUs. In this context, both incremental improvements and novel innovative services will have to be continuously retrofitted using ANNs execution in software embedded on sensing/actuating systems already deployed in the field. However, there was so far no Rust embedded software platform automating parallelization for inference computation on multi-core MCUs executing arbitrary TinyML models. This paper thus fills this gap by introducing Ariel-ML, a novel toolkit we designed combining a generic TinyML pipeline and an embedded Rust software platform which can take full advantage of multi-core capabilities of various 32bit microcontroller families (Arm Cortex-M, RISC-V, ESP-32). We published the full open source code of its implementation, which we used to benchmark its capabilities using a zoo of various TinyML models. We show that Ariel-ML outperforms prior art in terms of inference latency as expected, and we show that, compared to pre-existing toolkits using embedded C/C++, Ariel-ML achieves comparable memory footprints. Ariel-ML thus provides a useful basis for TinyML practitioners and resource-constrained embedded Rust developers.

LGDec 10, 2025Code
TinyDéjàVu: Smaller Memory Footprint & Faster Inference on Sensor Data Streams with Always-On Microcontrollers

Zhaolan Huang, Emmanuel Baccelli

Always-on sensors are increasingly expected to embark a variety of tiny neural networks and to continuously perform inference on time-series of the data they sense. In order to fit lifetime and energy consumption requirements when operating on battery, such hardware uses microcontrollers (MCUs) with tiny memory budget e.g., 128kB of RAM. In this context, optimizing data flows across neural network layers becomes crucial. In this paper, we introduce TinyDéjàVu, a new framework and novel algorithms we designed to drastically reduce the RAM footprint required by inference using various tiny ML models for sensor data time-series on typical microcontroller hardware. We publish the implementation of TinyDéjàVu as open source, and we perform reproducible benchmarks on hardware. We show that TinyDéjàVu can save more than 60% of RAM usage and eliminate up to 90% of redundant compute on overlapping sliding window inputs.

32.1OSApr 30Code
treVM: Tiny Rust Embedded Virtual Machines with WASM on Variable Resource-Constrained Hardware

Antoine Lavandier, Bastien Buil, Chrystel Gaber et al.

Software stacks embedded on microcontroller-based hardware typically provide rudimentary APIs programmed in C/C++, basic connectivity and, sometimes, a firmware update mechanism. Such coarse mechanisms contrast with widely used APIs and more advanced networked interaction expected from software stacks deployed on less resource-constrained hardware (microprocessor-based). In this paper, we aim to bridge this gap by designing treVM, a generic scheme to host high-level WebAssembly code capsules, bolted on a general-purpose Rust embedded software platform, able to run on a large variety of 32-bit microcontrollers. Not only can treVM capsules host highly customizable business logic, but capsules can also be securely updated on demand over the network, on devices already deployed in the field. We implement treVM in Rust, on top of Ariel OS, a general-purpose RTOS, and we publish the code as open source. Based on our implementation, we validate the feasibility of treVM on commonly available boards, and we report on extensive benchmarks we performed on heterogeneous hardware including Arm Cortex-M, RISC-V, and Xtensa microcontroller architectures. As such, treVM provides a promising new framework to secure continuous deployment of embedded software on low-power networked devices.

CRJun 10, 2021Code
Quantum-Resistant Security for Software Updates on Low-power Networked Embedded Devices

Gustavo Banegas, Koen Zandberg, Adrian Herrmann et al.

As the Internet of Things (IoT) rolls out today to devices whose lifetime may well exceed a decade, conservative threat models should consider attackers with access to quantum computing power. The SUIT standard (specified by the IETF) defines a security architecture for IoT software updates, standardizing the metadata and the cryptographic tools-namely, digital signatures and hash functions-that guarantee the legitimacy of software updates. While the performance of SUIT has previously been evaluated in the pre-quantum context, it has not yet been studied in a post-quantum context. Taking the open-source implementation of SUIT available in RIOT as a case study, we overview post-quantum considerations, and quantum-resistant digital signatures in particular, focusing on lowpower, microcontroller-based IoT devices which have stringent resource constraints in terms of memory, CPU, and energy consumption. We benchmark a selection of proposed post-quantum signature schemes (LMS, Falcon, and Dilithium) and compare them with current pre-quantum signature schemes (Ed25519 and ECDSA). Our benchmarks are carried out on a variety of IoT hardware including ARM Cortex-M, RISC-V, and Espressif (ESP32), which form the bulk of modern 32-bit microcontroller architectures. We interpret our benchmark results in the context of SUIT, and estimate the real-world impact of post-quantum alternatives for a range of typical software update categories. CCS CONCEPTS $\bullet$ Computer systems organization $\rightarrow$ Embedded systems.

28.3OSApr 28
Embedded Rust or C Firmware? Lessons from an Industrial Microcontroller Use Case with Ariel OS

Bipin Thapa, Daniele Alfonso, Lorenzo Bini et al.

As Rust gains traction for developing safer systems software, a reality check for the microcontroller hardware segment becomes necessary. How ready is the Rust ecosystem for this segment? Can Rust compete with C in practice? This paper reports on an IoT industrial case study that contributes to answering these questions. Two teams concurrently developing the same functionality (one in C, one in Rust) are analyzed over a period of several months. A comparative analysis of their approaches, results, and iterative efforts is provided. The analysis and measurements on hardware indicate no strong reason to prefer C over Rust for microcontroller firmware on the basis of memory footprint or execution speed. Furthermore, Ariel OS is shown to provide an efficient and portable system runtime in Rust whose footprint is smaller than that of the state-of-the-art bare-metal C stack traditionally used in this context. It is concluded that Rust is a sound choice today for firmware development in this domain.

SEJun 10, 2021
Femto-Containers: DevOps on Microcontrollers with Lightweight Virtualization & Isolation for IoT Software Modules

Koen Zandberg, Emmanuel Baccelli

Development, deployment and maintenance of networked software has been revolutionized by DevOps, which have become essential to boost system software quality and to enable agile evolution. Meanwhile the Internet of Things (IoT) connects more and more devices which are not covered by DevOps tools: low-power, microcontroller-based devices. In this paper, we contribute to bridge this gap by designing Femto-Containers, a new architecture which enables containerization, virtualization and secure deployment of software modules embedded on microcontrollers over low-power networks. As proof-of-concept, we implemented and evaluated Femto-Containers on popular microcontroller architectures (Arm Cortex-M, ESP32 and RISC-V), using eBPF virtualization, and RIOT, a common operating system in this space. We show that Femto-Containers can virtualize and isolate multiple software modules, executed concurrently, with very small memory footprint overhead (below 10%) and very small startup time (tens of microseconds) compared to native code execution. We show that Femto-Containers can satisfy the constraints of both low-level debug logic inserted in a hot code path, and high-level business logic coded in a variety of common programming languages. Compared to prior work, Femto-Containers thus offer an attractive trade-off in terms of memory footprint, energy consumption, agility and security.

CRNov 24, 2020
Low-Power IoT Communication Security: On the Performance of DTLS and TLS 1.3

Gabriele Restuccia, Hannes Tschofenig, Emmanuel Baccelli

Similarly to elsewhere on the Internet, practical security in the Internet of Things (IoT) is achieved by combining an array of mechanisms, at work at all layers of the protocol stack, in system software, and in hardware. Standard protocols such as Datagram Transport Layer Security (DTLS 1.2) and Transport Layer Security (TLS 1.2) are often recommended to secure communications to/from IoT devices. Recently, the TLS 1.3 standard was released and DTLS 1.3 is in the final stages of standardization. In this paper, we give an overview of version 1.3 of these protocols, and we provide the first experimental comparative performance analysis of different implementations and various configurations of these protocols, on real IoT devices based on low-power microcontrollers. We show how different implementations lead to different compromises. We measure and compare bytes-over-the-air, memory footprint, and energy consumption. We show that, when DTLS/TLS 1.3 requires more resources than DTLS/TLS 1.2, this additional overhead is quite reasonable. We also observe that, in some configurations, DTLS/TLS 1.3 actually decreases overhead and resource consumption. All in all, our study indicates that there is still room to optimize the existing implementations of these protocols.