Valerio Schiavoni

h-index: 19

18 papers

188 citations

Novelty: 39%

AI Score: 41

Ranked #64,523 of 194,257 authors (top 33%); #1,502 in CR (top 22%)

18 Papers

5.3 LG Sep 13, 2023 Code

Mitigating Adversarial Attacks in Federated Learning with Trusted Execution Environments

Simon Queyrut, Valerio Schiavoni, Pascal Felber

The main premise of federated learning (FL) is that machine learning model updates are computed locally to preserve user data privacy. By design, this approach prevents user data from ever leaving the perimeter of the device. Once the updates are aggregated, the model is broadcast to all nodes in the federation. However, without proper defenses, compromised nodes can probe the model inside their local memory in search of adversarial examples, which can lead to dangerous real-world scenarios. For instance, in image-based applications, adversarial examples consist of images that are perturbed imperceptibly to the human eye yet misclassified by the local model. These adversarial images are later presented to a victim node's counterpart model to replay the attack. Typical examples harness dissemination strategies such as altered traffic signs (patch attacks) that are no longer recognized by autonomous vehicles, or seemingly unaltered samples that poison the local dataset of the FL scheme to undermine its robustness. Pelta is a novel shielding mechanism leveraging Trusted Execution Environments (TEEs) that reduces the ability of attackers to craft adversarial samples. Pelta masks inside the TEE the first part of the back-propagation chain rule, typically exploited by attackers to craft the malicious samples. We evaluate Pelta on state-of-the-art accurate models using three well-established datasets: CIFAR-10, CIFAR-100 and ImageNet. We show the effectiveness of Pelta in mitigating six white-box state-of-the-art adversarial attacks, such as Projected Gradient Descent, Momentum Iterative Method, Auto Projected Gradient Descent, and the Carlini & Wagner attack. In particular, to the best of our knowledge, Pelta constitutes the first attempt at defending an ensemble model against the Self-Attention Gradient attack. Our code is available to the research community at https://github.com/queyrusi/Pelta.
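The white-box attacks named in this abstract all rely on the gradient of the loss with respect to the input, which is the quantity obtained by running the back-propagation chain rule all the way down to the sample. As a minimal sketch of that dependency, the following implements L-infinity Projected Gradient Descent against a toy logistic-regression model. The model, the parameter names (`eps`, `alpha`, `steps`) and the values used are illustrative assumptions, not the models or settings evaluated in the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(w, b, x, y):
    """Gradient of the logistic (cross-entropy) loss w.r.t. the input x, for label y in {0, 1}."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # dL/dz = p - y and dz/dx_i = w_i, so dL/dx_i = (p - y) * w_i
    return [(p - y) * wi for wi in w]

def pgd_attack(w, b, x, y, eps=0.3, alpha=0.1, steps=10):
    """L-infinity PGD: signed-gradient ascent steps, each projected back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        grad = input_gradient(w, b, x_adv, y)
        for i, g in enumerate(grad):
            step = alpha * ((g > 0) - (g < 0))  # alpha * sign(g)
            # Projection: clip each coordinate to [x_i - eps, x_i + eps]
            x_adv[i] = min(max(x_adv[i] + step, x[i] - eps), x[i] + eps)
    return x_adv

# Toy model and sample (assumed values for illustration only)
w, b = [2.0, -1.0], 0.0
x, y = [0.2, 0.1], 1
x_adv = pgd_attack(w, b, x, y)
```

An attacker who cannot evaluate `input_gradient` (because part of the chain rule is computed inside an enclave) has to fall back on approximations, which is the setting a shielding mechanism of this kind targets.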

2.3 CE Sep 5, 2024 Code

Practical Forecasting of Cryptocoins Timeseries using Correlation Patterns

Pasquale De Rosa, Pascal Felber, Valerio Schiavoni

Cryptocoins (e.g., Bitcoin, Ether, Litecoin) are tradable digital assets. Ownership of cryptocoins is registered on distributed ledgers (i.e., blockchains). Secure encryption techniques guarantee the security of the transactions (transfers of coins among owners) registered in the ledger. Cryptocoins are exchanged for specific trading prices. The extreme volatility of such trading prices across all different sets of crypto-assets remains undisputed. However, the relations between the trading prices of different cryptocoins remain largely unexplored, even though major coin exchanges indicate trend correlations to advise on sales or purchases. We shed some light on the trend correlations across a large variety of cryptocoins by investigating their coin/price correlation trends over the past two years. We study the causality between the trends, and exploit the derived correlations to understand the accuracy of state-of-the-art forecasting techniques for time series modeling (e.g., GBMs, LSTM and GRU) of correlated cryptocoins. Our evaluation shows (i) strong correlation patterns between the most traded coins (e.g., Bitcoin and Ether) and other types of cryptocurrencies, and (ii) that state-of-the-art time series forecasting algorithms can be used to forecast cryptocoin price trends. We released datasets and code to reproduce our analysis to the research community.

6.6 ET Mar 24

PIM-CACHE: High-Efficiency Content-Aware Copy for Processing-In-Memory

Peterson Yuhala, Mpoki Mwaisela, Pascal Felber et al.

Processing-in-memory (PIM) architectures bring computation closer to data, reducing the processor-memory transfer bottleneck in traditional processor-centric designs. Novel hardware solutions, such as UPMEM's in-memory processing technology, achieve this by integrating low-power DRAM processing units (DPUs) into memory DIMMs, enabling massive parallelism and improved memory bandwidth. However, paradoxically, these PIM architectures introduce mandatory coarse-grained data transfers between host DRAM and DPUs, which often become the new bottleneck. We present PIM-CACHE, a lightweight data staging layer that dynamically eliminates redundant data transfers to PIM DPUs by exploiting workload similarity, achieving content-aware copy (CAC). We evaluate PIM-CACHE on both synthetic workloads and real-world genome datasets, demonstrating its effectiveness in reducing PIM data transfer overhead.
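The abstract does not describe how PIM-CACHE detects redundant transfers, so the following is only a sketch of the general idea of a content-aware copy, under the assumption that redundancy is detected by hashing each block. The class `StagingCache`, its `transfer` callback and the use of SHA-256 are all hypothetical and are not taken from the paper or from the UPMEM SDK.

```python
import hashlib

class StagingCache:
    """Hypothetical staging layer: copy a block to a device only if its content is not already there."""

    def __init__(self, transfer):
        self.transfer = transfer  # callable(device_id, block) that performs the real host-to-device copy
        self.resident = {}        # device_id -> set of content digests already copied to that device
        self.skipped = 0          # number of redundant copies avoided

    def copy(self, device_id, block: bytes) -> bool:
        """Return True if the block was transferred, False if the copy was skipped as redundant."""
        digest = hashlib.sha256(block).digest()
        seen = self.resident.setdefault(device_id, set())
        if digest in seen:
            self.skipped += 1
            return False
        self.transfer(device_id, block)
        seen.add(digest)
        return True

# Usage with a stand-in transfer function that just records what was sent
sent = []
cache = StagingCache(lambda dev, blk: sent.append((dev, blk)))
first = cache.copy(0, b"ACGTACGT")   # new content for device 0
second = cache.copy(0, b"ACGTACGT")  # identical content, same device
third = cache.copy(1, b"ACGTACGT")   # same content, different device
```

Tracking digests per device keeps the sketch simple; a real staging layer would also have to account for eviction and for the cost of hashing relative to the transfer it saves.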

1.2 ST Nov 30, 2022 Code

Understanding Cryptocoins Trends Correlations

Pasquale De Rosa, Valerio Schiavoni

Crypto-coins (also known as cryptocurrencies) are tradable digital assets. Notable examples include Bitcoin, Ether and Litecoin. Ownership of cryptocoins is registered on distributed ledgers (i.e., blockchains). Secure encryption techniques guarantee the security of the transactions (transfers of coins across owners) registered in the ledger. Cryptocoins are exchanged for specific trading prices. While history has shown the extreme volatility of such trading prices across all different sets of crypto-assets, it remains unclear whether tight relations exist between the trading prices of different cryptocoins, and what they are. Major coin exchanges (e.g., Coinbase) provide trend correlation indicators to coin owners, suggesting possible acquisitions or sales. However, these correlations remain largely unvalidated. In this paper, we shed light on the trend correlations across a large variety of cryptocoins by investigating their coin-price correlation trends over a period of two years. Our experimental results suggest strong correlation patterns between main coins (Ethereum, Bitcoin) and alt-coins. We believe our study can support forecasting techniques for time-series modeling in the context of crypto-coins. We release our dataset and code to reproduce our analysis to the research community.
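The abstract does not state which correlation measure is used. As an illustrative assumption, the sketch below computes the Pearson correlation between the period-over-period returns of two price series. The price values are synthetic and the function names are hypothetical.

```python
import math

def returns(prices):
    """Relative change between consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic prices: the second series moves proportionally to the first
coin_a = [100.0, 110.0, 99.0, 120.0]
coin_b = [10.0, 11.0, 9.9, 12.0]
rho = pearson(returns(coin_a), returns(coin_b))
```

Correlating returns rather than raw prices avoids the spurious correlation that two independently trending series would otherwise show.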

2.0 LG Aug 8, 2023

Pelta: Shielding Transformers to Mitigate Evasion Attacks in Federated Learning

Simon Queyrut, Yérom-David Bromberg, Valerio Schiavoni

The main premise of federated learning is that machine learning model updates are computed locally, in particular to preserve user data privacy, as user data never leaves the perimeter of the device. This mechanism assumes that the general model, once aggregated, is broadcast to collaborating and non-malicious nodes. However, without proper defenses, compromised clients can easily probe the model inside their local memory in search of adversarial examples. For instance, in image-based applications, adversarial examples consist of images that are perturbed imperceptibly to the human eye yet misclassified by the local model, and that can later be presented to a victim node's counterpart model to replicate the attack. To mitigate such malicious probing, we introduce Pelta, a novel shielding mechanism leveraging trusted hardware. By harnessing the capabilities of Trusted Execution Environments (TEEs), Pelta masks part of the back-propagation chain rule, otherwise typically exploited by attackers for the design of malicious samples. We evaluate Pelta on a state-of-the-art ensemble model and demonstrate its effectiveness against the Self-Attention Gradient adversarial attack.

6.6 CR Apr 27, 2021 Code

KEVLAR-TZ: A Secure Cache for ARM TrustZone

Oscar Benedito, Ricard Delgado-Gonzalo, Valerio Schiavoni

Edge devices are increasingly in charge of storing privacy-sensitive data. In particular, implantables, wearables, and nearables can potentially collect and process high-resolution vital signs 24/7. Storing and performing computations over such data in a privacy-preserving fashion is of paramount importance. We present KEVLAR-TZ, an application-level trusted cache designed to leverage ARM TrustZone, a popular trusted execution environment available in consumer-grade devices. To facilitate integration with existing systems, IoT devices and protocols, KEVLAR-TZ exposes a REST-based interface with connection endpoints inside the TrustZone enclave. Furthermore, it exploits the on-device secure persistent storage to guarantee durability of data across reboots. We fully implemented KEVLAR-TZ on top of the OP-TEE framework, and experimentally evaluated its performance. Our results showcase performance trade-offs, for instance in terms of throughput and latency, for various workloads, and we believe they can be useful for practitioners and, more generally, developers of systems for TrustZone. KEVLAR-TZ is available as open-source at https://github.com/mqttz/kevlar-tz/.

8.8 CR Dec 11, 2020

TEEMon: A continuous performance monitoring framework for TEEs

Robert Krahn, Donald Dragoti, Franz Gregor et al.

Trusted Execution Environments (TEEs), such as Intel Software Guard eXtensions (SGX), are considered a promising approach to resolve security challenges in clouds. TEEs protect the confidentiality and integrity of application code and data even against privileged attackers with root and physical access by providing an isolated secure memory area, i.e., enclaves. The security guarantees are provided by the CPU, thus even if system software is compromised, the attacker can never access the enclave's content. While this approach ensures strong security guarantees for applications, it also introduces a considerable runtime overhead, in part due to the limited availability of protected memory (enclave page cache). Currently, only a limited number of performance measurement tools for TEE-based applications exist, and none offers performance monitoring and analysis during runtime. This paper presents TEEMon, the first continuous performance monitoring and analysis tool for TEE-based applications. TEEMon not only provides fine-grained performance metrics during runtime, but also assists in identifying the causes of performance bottlenecks, e.g., excessive system calls. Our approach integrates smoothly with existing open-source tools (e.g., Prometheus or Grafana) towards a holistic monitoring solution, particularly optimized for systems deployed through Docker containers or Kubernetes, and offers several dedicated metrics and visualizations. Our evaluation shows that TEEMon's overhead ranges from 5% to 17%.

5.2 CR Jul 24, 2020 Code

MQT-TZ: Hardening IoT Brokers Using ARM TrustZone

Carlos Segarra, Ricard Delgado-Gonzalo, Valerio Schiavoni

The publish-subscribe paradigm is an efficient communication scheme with strong decoupling between the nodes, which makes it especially well suited to large-scale deployments. It adapts natively to very dynamic settings and is used in a diversity of real-world scenarios, including finance, smart cities, medical environments, and IoT sensors. Several of these application scenarios require increasingly stringent security guarantees due to the sensitive nature of the exchanged messages as well as the privacy demands of the clients/stakeholders/receivers. MQTT is a lightweight topic-based publish-subscribe protocol popular in edge and IoT settings, a de-facto standard widely adopted by industry and researchers. However, MQTT brokers must process data in the clear, hence exposing a large attack surface. This paper presents MQT-TZ, a secure MQTT broker leveraging Arm TrustZone, a trusted execution environment (TEE) commonly found even on inexpensive, widely available devices (such as Raspberry Pi units). We define a mutual TLS-based handshake and a two-layer encryption for end-to-end security using the TEE as a trusted proxy. The experimental evaluation of our fully implemented prototype with micro- and macro-benchmarks, as well as with real-world industrial workloads from a MedTech use-case, highlights several trade-offs of using a TrustZone TEE. We report several lessons learned while building and evaluating our system. We release MQT-TZ as open-source.

2.9 CR Jul 3, 2020

MQT-TZ: Secure MQTT Broker for Biomedical Signal Processing on the Edge

Carlos Segarra, Ricard Delgado-Gonzalo, Valerio Schiavoni

Physical health records belong to healthcare providers, but the information contained within belongs to each patient. Increasingly, health-related data is acquired by wearables and other IoT devices, following the growing trend of the "Quantified Self". Even though data protection regulations (e.g., GDPR) encourage the usage of privacy-preserving processing techniques, most of the current IoT infrastructure was not originally conceived for such purposes. One of the most used communication protocols, MQTT, is a lightweight publish-subscribe protocol commonly used in Edge and IoT applications. In MQTT, the broker must process data in clear text, hence exposing a large attack surface for a malicious agent to steal or tamper with this health-related data. In this paper, we introduce MQT-TZ, a secure MQTT broker leveraging Arm TrustZone, a popular Trusted Execution Environment (TEE). We define a mutual TLS-based handshake and a two-layer encryption for end-to-end security using the TEE as a trusted proxy. We provide a quantitative evaluation of our open-source PoC on streaming ECGs in real time and highlight the trade-offs.

2.6 LG Nov 15, 2024

On the Cost of Model-Serving Frameworks: An Experimental Evaluation

Pasquale De Rosa, Yérom-David Bromberg, Pascal Felber et al.

In machine learning (ML), the inference phase is the process of applying pre-trained models to new, unseen data with the objective of making predictions. During the inference phase, end-users interact with ML services to gain insights, recommendations, or actions based on the input data. For this reason, serving strategies are nowadays crucial for deploying and managing models in production environments effectively. These strategies ensure that models are available, scalable, reliable, and performant for real-world applications, such as time series forecasting, image classification, and natural language processing. In this paper, we evaluate the performance of five widely-used model serving frameworks (TensorFlow Serving, TorchServe, MLServer, MLflow, and BentoML) under four different scenarios (malware detection, cryptocoin price forecasting, image classification, and sentiment analysis). We demonstrate that TensorFlow Serving is able to outperform all the other frameworks in serving deep learning (DL) models. Moreover, we show that DL-specific frameworks (TensorFlow Serving and TorchServe) display significantly lower latencies than the three general-purpose ML frameworks (BentoML, MLflow, and MLServer).

6.6 CR May 6, 2021

Analysis and Improvement of Heterogeneous Hardware Support in Docker Images

Panagiotis Gkikopoulos, Valerio Schiavoni, Josef Spillner

Docker images are used to distribute and deploy cloud-native applications in containerised form. A container engine runs them with separated privileges according to namespaces. Recent studies have investigated security vulnerabilities and runtime characteristics of Docker images. In contrast, little is known about the extent of hardware-dependent features in them such as processor-specific trusted execution environments, graphics acceleration or extension boards. This problem can be generalised to missing knowledge about the extent of any hardware-bound instructions within the images that may require elevated privileges. We first conduct a systematic one-year evolution analysis of a sample of Docker images concerning their use of hardware-specific features. To improve the state of technology, we contribute novel tools to manage such images. Our heuristic hardware dependency detector and a hardware-aware Docker executor give early warnings upon missing dependencies instead of leading to silent or untimely failures. Our dataset and tools are released to the research community.

12.3 CR Apr 7, 2021 Code

Plinius: Secure and Persistent Machine Learning Model Training

Peterson Yuhala, Pascal Felber, Valerio Schiavoni et al.

With the increasing popularity of cloud-based machine learning (ML) techniques there comes a need for privacy and integrity guarantees for ML data. In addition, the significant scalability challenges faced by DRAM coupled with the high access times of secondary storage represent a huge performance bottleneck for ML systems. While solutions exist to tackle the security aspect, performance remains an issue. Persistent memory (PM) is resilient to power loss (unlike DRAM), provides fast and fine-granular access to memory (unlike disk storage) and has latency and bandwidth close to DRAM (in the order of ns and GB/s, respectively). We present PLINIUS, an ML framework using Intel SGX enclaves for secure training of ML models and PM for fault tolerance guarantees. PLINIUS uses a novel mirroring mechanism to create and maintain (i) encrypted mirror copies of ML models on PM, and (ii) encrypted training data in byte-addressable PM, for near-instantaneous data recovery after a system failure. Compared to disk-based checkpointing systems, PLINIUS is 3.2x and 3.7x faster respectively for saving and restoring models on real PM hardware, achieving robust and secure ML model training in SGX enclaves.

3.0 SE Jun 2, 2020

Monitoring Data Distribution and Exploitation in a Global-Scale Microservice Artefact Observatory

Panagiotis Gkikopoulos, Josef Spillner, Valerio Schiavoni

Reusable microservice artefacts are often deployed as black or grey boxes, with little concern for their properties and quality beyond a syntactical interface description. This leads application developers to chaotic and opportunistic assumptions about how a composite application will behave in the real world. Systematically analyzing and tracking these publicly available artefacts will grant much-needed predictability to microservice-based deployments. By establishing a distributed observatory and knowledge base, it is possible to track microservice repositories, analyze the artefacts reliably, and provide insights on their properties and quality to developers and researchers alike. This position paper argues for a federated research infrastructure with consensus voting among participants to establish and preserve ground truth about the insights.

16.2 CR Mar 31, 2020

Trust Management as a Service: Enabling Trusted Execution in the Face of Byzantine Stakeholders

Franz Gregor, Wojciech Ozga, Sébastien Vaucher et al.

Trust is arguably the most important challenge for critical services that are both deployed and accessed remotely over the network. These systems are exposed to a wide diversity of threats, ranging from bugs to exploits, active attacks, rogue operators, or simply careless administrators. To protect such applications, one needs to guarantee that they are properly configured and securely provisioned with the "secrets" (e.g., encryption keys) necessary to preserve not only the confidentiality, integrity and freshness of their data but also their code. Furthermore, these secrets should not be kept under the control of a single stakeholder - which might be compromised and would represent a single point of failure - and they must be protected across software versions in the sense that attackers cannot get access to them via malicious updates. Traditional approaches for solving these challenges often use ad hoc techniques and ultimately rely on a hardware security module (HSM) as root of trust. We propose a more powerful and generic approach to trust management that instead relies on trusted execution environments (TEEs) and a set of stakeholders as root of trust. Our system, PALAEMON, can operate as a managed service deployed in an untrusted environment, i.e., one can delegate its operations to an untrusted cloud provider with the guarantee that data will remain confidential without having to trust any individual human (even one with root access) or the system software. PALAEMON addresses in a secure, efficient and cost-effective way five main challenges faced when developing trusted networked applications and services. Our evaluation on a range of benchmarks and real applications shows that PALAEMON performs efficiently and can protect secrets of services without any change to their source code.

4.9 CR Jul 29, 2019

Secure Stream Processing for Medical Data

Carlos Segarra, Enric Muntané, Mathieu Lemay et al.

Medical data belongs to whoever produces it. Increasingly, this data is processed in unauthorized third-party clouds that should never have the opportunity to access it. Moreover, recent data protection regulations (e.g., GDPR) pave the way towards the development of privacy-preserving processing techniques. In this paper, we present a proof of concept of a streaming IoT architecture that securely processes cardiac data in the cloud by combining trusted hardware and Spark. The additional security guarantees come with no changes to the application's code on the server. We tested the system with a database containing ECGs recorded by wearable devices from 8 healthy males performing a standardized range of in-lab physical activities (e.g., run, walk, bike). We show that, when compared with standard Spark Streaming, the addition of privacy comes at the cost of doubling the execution time.

13.0 CR Jun 17, 2019

Using Trusted Execution Environments for Secure Stream Processing of Medical Data

Carlos Segarra, Ricard Delgado-Gonzalo, Mathieu Lemay et al.

Processing sensitive data, such as that produced by body sensors, on untrusted third-party clouds without compromising the privacy of the users generating it is particularly challenging. Typically, these sensors generate large quantities of continuous data in a streaming fashion. Such vast amounts of data must be processed efficiently and securely, even under strong adversarial models. The recent introduction in the mass market of consumer-grade processors with Trusted Execution Environments (TEEs), such as Intel SGX, paves the way to implement solutions that overcome less flexible approaches, such as those atop homomorphic encryption. We present a secure streaming processing system built on top of Intel SGX to showcase the viability of this approach with a system specifically fitted for medical data. We design and fully implement a prototype system that we evaluate with several realistic datasets. Our experimental results show that the proposed system achieves modest overhead compared to vanilla Spark while offering additional protection guarantees under powerful attackers and threat models.

7.3 DC May 4, 2018 Code

SecureStreams: A Reactive Middleware Framework for Secure Data Stream Processing

Aurélien Havet, Rafael Pires, Pascal Felber et al.

The growing adoption of distributed data processing frameworks in a wide diversity of application domains challenges end-to-end integration of properties like security, in particular when considering deployments in the context of large-scale clusters or multi-tenant Cloud infrastructures. This paper therefore introduces SecureStreams, a reactive middleware framework to deploy and process secure streams at scale. Its design combines the high-level reactive dataflow programming paradigm with Intel's low-level software guard extensions (SGX) in order to guarantee privacy and integrity of the processed data. The experimental results of SecureStreams are promising: while offering a fluent scripting language based on Lua, our middleware delivers high processing throughput, thus enabling developers to implement secure processing pipelines in just a few lines of code.

7.3 DC May 3, 2018

CYCLOSA: Decentralizing Private Web Search Through SGX-Based Browser Extensions

Rafael Pires, David Goltzsche, Sonia Ben Mokhtar et al.

By regularly querying Web search engines, users (unconsciously) disclose large amounts of their personal data as part of their search queries, some of which might reveal sensitive information (e.g., health issues, sexual, political or religious preferences). Several solutions exist to allow users to query search engines while improving privacy protection. However, these solutions suffer from a number of limitations: some are subject to user re-identification attacks, while others lack scalability or are unable to provide accurate results. This paper presents CYCLOSA, a secure, scalable and accurate private Web search solution. CYCLOSA improves security by relying on trusted execution environments (TEEs) as provided by Intel SGX. Further, CYCLOSA proposes a novel adaptive privacy protection solution that reduces the risk of user re-identification. CYCLOSA sends fake queries to the search engine and dynamically adapts their count according to the sensitivity of the user query. In addition, CYCLOSA achieves scalability as it is fully decentralized, spreading the load of distributing fake queries among other nodes. Finally, CYCLOSA achieves accuracy of Web search as it handles the real query and the fake queries separately, in contrast to other existing solutions that mix fake and real query results.
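The abstract says the number of fake queries adapts to the sensitivity of the user query but does not give the mapping. The sketch below assumes a sensitivity score in [0, 1] and a linear mapping up to a maximum `k_max`. Both the score and the mapping, as well as the function names and the decoy pool, are hypothetical and are not CYCLOSA's actual policy.

```python
import math
import random

def fake_query_count(sensitivity: float, k_max: int = 7) -> int:
    """Map a sensitivity score in [0, 1] to a number of fake queries in [0, k_max] (assumed linear)."""
    s = min(max(sensitivity, 0.0), 1.0)  # clamp out-of-range scores
    return math.ceil(s * k_max)

def plan_queries(query, sensitivity, decoys, k_max=7, rng=random):
    """Return the real query and a separate list of fake queries drawn from a decoy pool."""
    k = min(fake_query_count(sensitivity, k_max), len(decoys))
    fakes = rng.sample(decoys, k)
    # Real and fake queries are kept apart so their results are never mixed
    return query, fakes

# Usage with a made-up decoy pool and a seeded generator for repeatability
decoys = ["weather", "train times", "pasta recipe", "football scores", "cinema"]
real, fakes = plan_queries("symptoms of flu", 0.5, decoys, rng=random.Random(0))
```

Returning the real query separately from the fakes mirrors the abstract's point that results for real and fake queries are handled apart.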