LGNov 15, 2022
Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and ChallengesEnrique Tomás Martínez Beltrán, Mario Quiles Pérez, Pedro Miguel Sánchez Sánchez et al.
In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustworthiness concerns affecting the entity responsible for the global model creation. Decentralized Federated Learning (DFL) emerged to address these concerns by promoting decentralized model aggregation and minimizing reliance on centralized architectures. However, despite the work done in DFL, the literature has not (i) studied the main aspects differentiating DFL and CFL; (ii) analyzed DFL frameworks to create and evaluate new solutions; and (iii) reviewed application scenarios using DFL. Thus, this article identifies and analyzes the main fundamentals of DFL in terms of federation architectures, topologies, communication mechanisms, security approaches, and key performance indicators. Additionally, the paper at hand explores existing mechanisms to optimize critical DFL fundamentals. Then, the most relevant features of the current DFL frameworks are reviewed and compared. After that, it analyzes the most used DFL application scenarios, identifying solutions based on the fundamentals and frameworks previously defined. Finally, the evolution of existing DFL solutions is studied to provide a list of trends, lessons learned, and open challenges.
CRFeb 20, 2023
FederatedTrust: A Solution for Trustworthy Federated LearningPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Ning Xie et al.
The rapid expansion of the Internet of Things (IoT) and Edge Computing has presented challenges for centralized Machine and Deep Learning (ML/DL) methods due to the presence of distributed data silos that hold sensitive information. To address concerns regarding data privacy, collaborative and privacy-preserving ML/DL techniques like Federated Learning (FL) have emerged. However, ensuring data privacy and performance alone is insufficient since there is a growing need to establish trust in model predictions. Existing literature has proposed various approaches on trustworthy ML/DL (excluding data privacy), identifying robustness, fairness, explainability, and accountability as important pillars. Nevertheless, further research is required to identify trustworthiness pillars and evaluation metrics specifically relevant to FL models, as well as to develop solutions that can compute the trustworthiness level of FL models. This work examines the existing requirements for evaluating trustworthiness in FL and introduces a comprehensive taxonomy consisting of six pillars (privacy, robustness, fairness, explainability, accountability, and federation), along with over 30 metrics for computing the trustworthiness of FL models. Subsequently, an algorithm named FederatedTrust is designed based on the pillars and metrics identified in the taxonomy to compute the trustworthiness score of FL models. A prototype of FederatedTrust is implemented and integrated into the learning process of FederatedScope, a well-established FL framework. Finally, five experiments are conducted using different configurations of FederatedScope to demonstrate the utility of FederatedTrust in computing the trustworthiness of FL models. Three experiments employ the FEMNIST dataset, and two utilize the N-BaIoT dataset considering a real-world IoT security use case.
LGJun 16, 2023
Fedstellar: A Platform for Decentralized Federated LearningEnrique Tomás Martínez Beltrán, Ángel Luis Perales Gómez, Chao Feng et al.
In 2016, Google proposed Federated Learning (FL) as a novel paradigm to train Machine Learning (ML) models across the participants of a federation while preserving data privacy. Since its birth, Centralized FL (CFL) has been the most used approach, where a central entity aggregates participants' models to create a global one. However, CFL presents limitations such as communication bottlenecks, single point of failure, and reliance on a central server. Decentralized Federated Learning (DFL) addresses these issues by enabling decentralized model aggregation and minimizing dependency on a central entity. Despite these advances, current platforms training DFL models struggle with key issues such as managing heterogeneous federation network topologies. To overcome these challenges, this paper presents Fedstellar, a platform extended from p2pfl library and designed to train FL models in a decentralized, semi-decentralized, and centralized fashion across diverse federations of physical or virtualized devices. The Fedstellar implementation encompasses a web application with an interactive graphical interface, a controller for deploying federations of nodes using physical or virtual devices, and a core deployed on each device which provides the logic needed to train, aggregate, and communicate in the network. The effectiveness of the platform has been demonstrated in two scenarios: a physical deployment involving single-board devices such as Raspberry Pis for detecting cyberattacks, and a virtualized deployment comparing various FL approaches in a controlled environment using MNIST and CIFAR-10 datasets. In both scenarios, Fedstellar demonstrated consistent performance and adaptability, achieving F1 scores of 91%, 98%, and 91.2% using DFL for detecting cyberattacks and classifying MNIST and CIFAR-10, respectively, reducing training time by 32% compared to centralized approaches.
CRDec 30, 2022
RL and Fingerprinting to Select Moving Target Defense Mechanisms for Zero-day Attacks in IoTAlberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Jan von der Assen et al.
Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBC). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize the MTD selection through trial and error, but the literature fails when i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBC's states, and iii) calculating the consumption of resources in SBC. To improve these limitations, the work at hand proposes an online RL-based framework to learn the correct MTD mechanisms mitigating heterogeneous zero-day attacks in SBC. The framework considers behavioral fingerprinting to represent SBCs' states and RL to learn MTD techniques that mitigate each malicious state. It has been deployed on a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. More in detail, the Raspberry Pi has been infected with different samples of command and control malware, rootkits, and ransomware to later select between four existing MTD techniques. A set of experiments demonstrated the suitability of the framework to learn proper MTD techniques mitigating all attacks (except a harmfulness rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM.
LGOct 20, 2022
Analyzing the Robustness of Decentralized Horizontal and Vertical Federated Learning Architectures in a Non-IID ScenarioPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Enrique Tomás Martínez Beltrán et al.
Federated learning (FL) allows participants to collaboratively train machine and deep learning models while protecting data privacy. However, the FL paradigm still presents drawbacks affecting its trustworthiness since malicious participants could launch adversarial attacks against the training process. Related work has studied the robustness of horizontal FL scenarios under different attacks. However, there is a lack of work evaluating the robustness of decentralized vertical FL and comparing it with horizontal FL architectures affected by adversarial attacks. Thus, this work proposes three decentralized FL architectures, one for horizontal and two for vertical scenarios, namely HoriChain, VertiChain, and VertiComb. These architectures present different neural networks and training protocols suitable for horizontal and vertical scenarios. Then, a decentralized, privacy-preserving, and federated use case with non-IID data to classify handwritten digits is deployed to evaluate the performance of the three architectures. Finally, a set of experiments computes and compares the robustness of the proposed architectures when they are affected by different data poisoning based on image watermarks and gradient poisoning adversarial attacks. The experiments show that even though particular configurations of both attacks can destroy the classification performance of the architectures, HoriChain is the most robust one.
CRJun 27, 2023
RansomAI: AI-powered Ransomware for Stealthy EncryptionJan von der Assen, Alberto Huertas Celdrán, Janik Luechinger et al.
Cybersecurity solutions have shown promising performance when detecting ransomware samples that use fixed algorithms and encryption rates. However, due to the current explosion of Artificial Intelligence (AI), sooner than later, ransomware (and malware in general) will incorporate AI techniques to intelligently and dynamically adapt its encryption behavior to be undetected. It might result in ineffective and obsolete cybersecurity solutions, but the literature lacks AI-powered ransomware to verify it. Thus, this work proposes RansomAI, a Reinforcement Learning-based framework that can be integrated into existing ransomware samples to adapt their encryption behavior and stay stealthy while encrypting files. RansomAI presents an agent that learns the best encryption algorithm, rate, and duration that minimizes its detection (using a reward mechanism and a fingerprinting intelligent detection system) while maximizing its damage function. The proposed framework was validated in a ransomware, Ransomware-PoC, that infected a Raspberry Pi 4, acting as a crowdsensor. A pool of experiments with Deep Q-Learning and Isolation Forest (deployed on the agent and detection system, respectively) has demonstrated that RansomAI evades the detection of Ransomware-PoC affecting the Raspberry Pi 4 in a few minutes with >90% accuracy.
CROct 14, 2022
A Lightweight Moving Target Defense Framework for Multi-purpose Malware Affecting IoT DevicesJan von der Assen, Alberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez et al.
Malware affecting Internet of Things (IoT) devices is rapidly growing due to the relevance of this paradigm in real-world scenarios. Specialized literature has also detected a trend towards multi-purpose malware able to execute different malicious actions such as remote control, data leakage, encryption, or code hiding, among others. Protecting IoT devices against this kind of malware is challenging due to their well-known vulnerabilities and limitation in terms of CPU, memory, and storage. To improve it, the moving target defense (MTD) paradigm was proposed a decade ago and has shown promising results, but there is a lack of IoT MTD solutions dealing with multi-purpose malware. Thus, this work proposes four MTD mechanisms changing IoT devices' network, data, and runtime environment to mitigate multi-purpose malware. Furthermore, it presents a lightweight and IoT-oriented MTD framework to decide what, when, and how the MTD mechanisms are deployed. Finally, the efficiency and effectiveness of the framework and MTD mechanisms are evaluated in a real-world scenario with one IoT spectrum sensor affected by multi-purpose malware.
CRDec 30, 2022
Adversarial attacks and defenses on ML- and hardware-based IoT device fingerprinting and identificationPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet et al.
In the last years, the number of IoT devices deployed has suffered an undoubted explosion, reaching the scale of billions. However, some new cybersecurity issues have appeared together with this development. Some of these issues are the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the requirement for new device identification mechanisms based on behavior monitoring. Besides, these solutions have recently leveraged Machine and Deep Learning techniques due to the advances in this field and the increase in processing capabilities. In contrast, attackers do not stay stalled and have developed adversarial attacks focused on context modification and ML/DL evaluation evasion applied to IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. In this sense, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. Then, previous techniques have been compared with the proposed architecture using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves previous solutions achieving a +0.96 average F1-Score and 0.8 minimum TPR for all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the previous model to test its robustness. A temperature-based context attack was not able to disrupt the identification. However, some ML/DL state-of-the-art evasion attacks were successful. Finally, adversarial training and model distillation defense techniques are selected to improve the model resilience to evasion attacks, without degrading its performance.
SPSep 8, 2022
Studying Drowsiness Detection Performance while Driving through Scalable Machine Learning Models using ElectroencephalographyJosé Manuel Hidalgo Rogel, Enrique Tomás Martínez Beltrán, Mario Quiles Pérez et al.
- Background / Introduction: Driver drowsiness is a significant concern and one of the leading causes of traffic accidents. Advances in cognitive neuroscience and computer science have enabled the detection of drivers' drowsiness using Brain-Computer Interfaces (BCIs) and Machine Learning (ML). However, the literature lacks a comprehensive evaluation of drowsiness detection performance using a heterogeneous set of ML algorithms, and it is necessary to study the performance of scalable ML models suitable for groups of subjects. - Methods: To address these limitations, this work presents an intelligent framework employing BCIs and features based on electroencephalography for detecting drowsiness in driving scenarios. The SEED-VIG dataset is used to evaluate the best-performing models for individual subjects and groups. - Results: Results show that Random Forest (RF) outperformed other models used in the literature, such as Support Vector Machine (SVM), with a 78% f1-score for individual models. Regarding scalable models, RF reached a 79% f1-score, demonstrating the effectiveness of these approaches. This publication highlights the relevance of exploring a diverse set of ML algorithms and scalable approaches suitable for groups of subjects to improve drowsiness detection systems and ultimately reduce the number of accidents caused by driver fatigue. - Conclusions: The lessons learned from this study show that not only SVM but also other models not sufficiently explored in the literature are relevant for drowsiness detection. Additionally, scalable approaches are effective in detecting drowsiness, even when new subjects are evaluated. Thus, the proposed framework presents a novel approach for detecting drowsiness in driving scenarios using BCIs and ML.
CRJul 21, 2023
Mitigating Communications Threats in Decentralized Federated Learning through Moving Target DefenseEnrique Tomás Martínez Beltrán, Pedro Miguel Sánchez Sánchez, Sergio López Bernal et al.
The rise of Decentralized Federated Learning (DFL) has enabled the training of machine learning models across federated participants, fostering decentralized model aggregation and reducing dependence on a server. However, this approach introduces unique communication security challenges that have yet to be thoroughly addressed in the literature. These challenges primarily originate from the decentralized nature of the aggregation process, the varied roles and responsibilities of the participants, and the absence of a central authority to oversee and mitigate threats. Addressing these challenges, this paper first delineates a comprehensive threat model focused on DFL communications. In response to these identified risks, this work introduces a security module to counter communication-based attacks for DFL platforms. The module combines security techniques such as symmetric and asymmetric encryption with Moving Target Defense (MTD) techniques, including random neighbor selection and IP/port switching. The security module is implemented in a DFL platform, Fedstellar, allowing the deployment and monitoring of the federation. A DFL scenario with physical and virtual deployments have been executed, encompassing three security configurations: (i) a baseline without security, (ii) an encrypted configuration, and (iii) a configuration integrating both encryption and MTD techniques. The effectiveness of the security module is validated through experiments with the MNIST dataset and eclipse attacks. The results showed an average F1 score of 95%, with the most secure configuration resulting in CPU usage peaking at 68% (+-9%) in virtual deployments and network traffic reaching 480.8 MB (+-18 MB), effectively mitigating risks associated with eavesdropping or eclipse attacks.
CRJan 21
Dynamic Management of a Deep Learning-Based Anomaly Detection System for 5G NetworksLorenzo Fernández Maimó, Alberto Huertas Celdrán, Manuel Gil Pérez et al.
Fog and mobile edge computing (MEC) will play a key role in the upcoming fifth generation (5G) mobile networks to support decentralized applications, data analytics and management into the network itself by using a highly distributed compute model. Furthermore, increasing attention is paid to providing user-centric cybersecurity solutions, which particularly require collecting, processing and analyzing significantly large amount of data traffic and huge number of network connections in 5G networks. In this regard, this paper proposes a MEC-oriented solution in 5G mobile networks to detect network anomalies in real-time and in autonomic way. Our proposal uses deep learning techniques to analyze network flows and to detect network anomalies. Moreover, it uses policies in order to provide an efficient and dynamic management system of the computing resources used in the anomaly detection process. The paper presents relevant aspects of the deployment of the proposal and experimental results to show its performance.
CRFeb 15Code
Anticipating Adversary Behavior in DevSecOps Scenarios through Large Language ModelsMario Marín Caballero, Miguel Betancourt Alonso, Daniel Díaz-López et al.
The most valuable asset of any cloud-based organization is data, which is increasingly exposed to sophisticated cyberattacks. Until recently, the implementation of security measures in DevOps environments was often considered optional by many government entities and critical national services operating in the cloud. This includes systems managing sensitive information, such as electoral processes or military operations, which have historically been valuable targets for cybercriminals. Resistance to security implementation is often driven by concerns over losing agility in software development, increasing the risk of accumulated vulnerabilities. Nowadays, patching software is no longer enough; adopting a proactive cyber defense strategy, supported by Artificial Intelligence (AI), is crucial to anticipating and mitigating threats. Thus, this work proposes integrating the Security Chaos Engineering (SCE) methodology with a new LLM-based flow to automate the creation of attack defense trees that represent adversary behavior and facilitate the construction of SCE experiments based on these graphical models, enabling teams to stay one step ahead of attackers and implement previously unconsidered defenses. Further detailed information about the experiment performed, along with the steps to replicate it, can be found in the following repository: https://github.com/mariomc14/devsecops-adversary-llm.git.
CRMay 15, 2024
Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System CallsPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet et al.
In the current cybersecurity landscape, protecting military devices such as communication and battlefield management systems against sophisticated cyber attacks is crucial. Malware exploits vulnerabilities through stealth methods, often evading traditional detection mechanisms such as software signatures. The application of ML/DL in vulnerability detection has been extensively explored in the literature. However, current ML/DL vulnerability detection methods struggle with understanding the context and intent behind complex attacks. Integrating large language models (LLMs) with system call analysis offers a promising approach to enhance malware detection. This work presents a novel framework leveraging LLMs to classify malware based on system call data. The framework uses transfer learning to adapt pre-trained LLMs for malware detection. By retraining LLMs on a dataset of benign and malicious system calls, the models are refined to detect signs of malware activity. Experiments with a dataset of over 1TB of system calls demonstrate that models with larger context sizes, such as BigBird and Longformer, achieve superior accuracy and F1-Score of approximately 0.86. The results highlight the importance of context size in improving detection rates and underscore the trade-offs between computational complexity and performance. This approach shows significant potential for real-time detection in high-stakes environments, offering a robust solution to evolving cyber threats.
LGJan 31, 2025
S-VOTE: Similarity-based Voting for Client Selection in Decentralized Federated LearningPedro Miguel Sánchez Sánchez, Enrique Tomás Martínez Beltrán, Chao Feng et al.
Decentralized Federated Learning (DFL) enables collaborative, privacy-preserving model training without relying on a central server. This decentralized approach reduces bottlenecks and eliminates single points of failure, enhancing scalability and resilience. However, DFL also introduces challenges such as suboptimal models with non-IID data distributions, increased communication overhead, and resource usage. Thus, this work proposes S-VOTE, a voting-based client selection mechanism that optimizes resource usage and enhances model performance in federations with non-IID data conditions. S-VOTE considers an adaptive strategy for spontaneous local training that addresses participation imbalance, allowing underutilized clients to contribute without significantly increasing resource costs. Extensive experiments on benchmark datasets demonstrate the S-VOTE effectiveness. More in detail, it achieves lower communication costs by up to 21%, 4-6% faster convergence, and improves local performance by 9-17% compared to baseline methods in some configurations, all while achieving a 14-24% energy consumption reduction. These results highlight the potential of S-VOTE to address DFL challenges in heterogeneous environments.
LGDec 15, 2024
ProFe: Communication-Efficient Decentralized Federated Learning via Distillation and PrototypesPedro Miguel Sánchez Sánchez, Enrique Tomás Martínez Beltrán, Miguel Fernández Llamas et al.
Decentralized Federated Learning (DFL) trains models in a collaborative and privacy-preserving manner while removing model centralization risks and improving communication bottlenecks. However, DFL faces challenges in efficient communication management and model aggregation within decentralized environments, especially with heterogeneous data distributions. Thus, this paper introduces ProFe, a novel communication optimization algorithm for DFL that combines knowledge distillation, prototype learning, and quantization techniques. ProFe utilizes knowledge from large local models to train smaller ones for aggregation, incorporates prototypes to better learn unseen classes, and applies quantization to reduce data transmitted during communication rounds. The performance of ProFe has been validated and compared to the literature by using benchmark datasets like MNIST, CIFAR10, and CIFAR100. Results showed that the proposed algorithm reduces communication costs by up to ~40-50% while maintaining or improving model performance. In addition, it adds ~20% training time due to increased complexity, generating a trade-off.
CRAug 5, 2025
Simulating Cyberattacks through a Breach Attack Simulation (BAS) Platform empowered by Security Chaos Engineering (SCE)Arturo Sánchez-Matas, Pablo Escribano Ruiz, Daniel Díaz-López et al.
In today digital landscape, organizations face constantly evolving cyber threats, making it essential to discover slippery attack vectors through novel techniques like Security Chaos Engineering (SCE), which allows teams to test defenses and identify vulnerabilities effectively. This paper proposes to integrate SCE into Breach Attack Simulation (BAS) platforms, leveraging adversary profiles and abilities from existing threat intelligence databases. This innovative proposal for cyberattack simulation employs a structured architecture composed of three layers: SCE Orchestrator, Connector, and BAS layers. Utilizing MITRE Caldera in the BAS layer, our proposal executes automated attack sequences, creating inferred attack trees from adversary profiles. Our proposal evaluation illustrates how integrating SCE with BAS can enhance the effectiveness of attack simulations beyond traditional scenarios, and be a useful component of a cyber defense strategy.
CRJan 31, 2022
Studying the Robustness of Anti-adversarial Federated Learning Models Detecting Cyberattacks in IoT Spectrum SensorsPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Timo Schenk et al.
Device fingerprinting combined with Machine and Deep Learning (ML/DL) report promising performance when detecting cyberattacks targeting data managed by resource-constrained spectrum sensors. However, the amount of data needed to train models and the privacy concerns of such scenarios limit the applicability of centralized ML/DL-based approaches. Federated learning (FL) addresses these limitations by creating federated and privacy-preserving models. However, FL is vulnerable to malicious participants, and the impact of adversarial attacks on federated models detecting spectrum sensing data falsification (SSDF) attacks on spectrum sensors has not been studied. To address this challenge, the first contribution of this work is the creation of a novel dataset suitable for FL and modeling the behavior (usage of CPU, memory, or file system, among others) of resource-constrained spectrum sensors affected by different SSDF attacks. The second contribution is a pool of experiments analyzing and comparing the robustness of federated models according to i) three families of spectrum sensors, ii) eight SSDF attacks, iii) four scenarios dealing with unsupervised (anomaly detection) and supervised (binary classification) federated models, iv) up to 33% of malicious participants implementing data and model poisoning attacks, and v) four aggregation functions acting as anti-adversarial mechanisms to increase the models robustness.
CRJan 14, 2022
CyberSpec: Intelligent Behavioral Fingerprinting to Detect Attacks on Crowdsensing Spectrum SensorsAlberto Huertas Celdrán, Pedro Miguel Sánchez Sánchez, Gérôme Bovet et al.
Integrated sensing and communication (ISAC) is a novel paradigm using crowdsensing spectrum sensors to help with the management of spectrum scarcity. However, well-known vulnerabilities of resource-constrained spectrum sensors and the possibility of being manipulated by users with physical access complicate their protection against spectrum sensing data falsification (SSDF) attacks. Most recent literature suggests using behavioral fingerprinting and Machine/Deep Learning (ML/DL) for improving similar cybersecurity issues. Nevertheless, the applicability of these techniques in resource-constrained devices, the impact of attacks affecting spectrum data integrity, and the performance and scalability of models suitable for heterogeneous sensors types are still open challenges. To improve limitations, this work presents seven SSDF attacks affecting spectrum sensors and introduces CyberSpec, an ML/DL-oriented framework using device behavioral fingerprinting to detect anomalies produced by SSDF attacks affecting resource-constrained spectrum sensors. CyberSpec has been implemented and validated in ElectroSense, a real crowdsensing RF monitoring platform where several configurations of the proposed SSDF attacks have been executed in different sensors. A pool of experiments with different unsupervised ML/DL-based models has demonstrated the suitability of CyberSpec detecting the previous attacks within an acceptable timeframe.
CRNov 29, 2021
Robust Federated Learning for execution time-based device model identification under label-flipping attackPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, José Rafael Buendía Rubio et al.
The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, several solutions have emerged to identify device models and types based on the combination of behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriated for scenarios where data privacy and protection is a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during the federated model training is evaluated, using several aggregation mechanisms as countermeasure. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades when the percentage of fully malicious clients (all training samples poisoned) grows over 50%.
CRJun 15, 2021
A methodology to identify identical single-board computers based on hardware behavior fingerprintingPedro Miguel Sánchez Sánchez, José María Jorquera Valero, Alberto Huertas Celdrán et al.
The connectivity and resource-constrained nature of single-board devices open the door to cybersecurity concerns affecting Internet of Things (IoT) scenarios. One of the most important issues is the presence of unauthorized IoT devices that want to impersonate legitimate ones by using identical hardware and software specifications. This situation can provoke sensitive information leakages, data poisoning, or privilege escalation in IoT scenarios. Combining behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques is a promising approach to identify these malicious spoofing devices by detecting minor performance differences generated by imperfections in manufacturing. However, existing solutions are not suitable for single-board devices since they do not consider their hardware and software limitations, underestimate critical aspects such as fingerprint stability or context changes, and do not explore the potential of ML/DL techniques. To improve it, this work first identifies the essential properties for single-board device identification: uniqueness, stability, diversity, scalability, efficiency, robustness, and security. Then, a novel methodology relies on behavioral fingerprinting to identify identical single-board devices and meet the previous properties. The methodology leverages the different built-in components of the system and ML/DL techniques, comparing the device internal behavior with each other to detect manufacturing variations. The methodology validation has been performed in a real environment composed of 15 identical Raspberry Pi 4 B and 10 Raspberry Pi 3 B+ devices, obtaining a 91.9% average TPR and identifying all devices by setting a 50% threshold in the evaluation process. Finally, a discussion compares the proposed solution with related work, highlighting the fingerprint properties not met, and provides important lessons learned and limitations.
CRJun 9, 2021
Eight Reasons Why Cybersecurity on Novel Generations of Brain-Computer Interfaces Must Be PrioritizedSergio López Bernal, Alberto Huertas Celdrán, Gregorio Martínez Pérez
This article presents eight neural cyberattacks affecting spontaneous neural activity, inspired by well-known cyberattacks from the computer science domain: Neural Flooding, Neural Jamming, Neural Scanning, Neural Selective Forwarding, Neural Spoofing, Neural Sybil, Neural Sinkhole and Neural Nonce. These cyberattacks are based on the exploitation of vulnerabilities existing in the new generation of Brain-Computer Interfaces. After presenting their formal definitions, the cyberattacks have been implemented over a neuronal simulation. To evaluate the impact of each cyberattack, they have been implemented in a Convolutional Neural Network (CNN) simulating a portion of a mouse's visual cortex. This implementation is based on existing literature indicating the similarities that CNNs have with neuronal structures from the visual cortex. Some conclusions are also provided, indicating that Neural Nonce and Neural Jamming are the most impactful cyberattacks for short-term effects, while Neural Scanning and Neural Nonce are the most damaging for long-term effects.
CRMay 23, 2021
Neuronal Jamming Cyberattack over Invasive BCI Affecting the Resolution of Tasks Requiring Visual CapabilitiesSergio López Bernal, Alberto Huertas Celdrán, Gregorio Martínez Pérez
Invasive Brain-Computer Interfaces (BCI) are extensively used in medical application scenarios to record, stimulate, or inhibit neural activity with different purposes. An example is the stimulation of some brain areas to reduce the effects generated by Parkinson's disease. Despite the advances in recent years, cybersecurity on BCI is an open challenge since attackers can exploit the vulnerabilities of invasive BCIs to induce malicious stimulation or treatment disruption, affecting neuronal activity. In this work, we design and implement a novel neuronal cyberattack, called Neuronal Jamming (JAM), which prevents neurons from producing spikes. To implement and measure the JAM impact, and due to the lack of realistic neuronal topologies in mammalians, we have defined a use case with a Convolutional Neural Network (CNN) trained to allow a mouse to exit a particular maze. The resulting model has been translated to a neural topology, simulating a portion of a mouse's visual cortex. The impact of JAM on both biological and artificial networks is measured, analyzing how the attacks can both disrupt the spontaneous neural signaling and the mouse's capacity to exit the maze. Besides, we compare the impacts of both JAM and FLO (an existing neural cyberattack) demonstrating that JAM generates a higher impact in terms of neuronal spike rate. Finally, we discuss on whether and how JAM and FLO attacks could induce the effects of neurodegenerative diseases if the implanted BCI had a comprehensive electrode coverage of the targeted brain regions.
CRAug 7, 2020
A Survey on Device Behavior Fingerprinting: Data Sources, Techniques, Application Scenarios, and DatasetsPedro Miguel Sánchez Sánchez, Jose María Jorquera Valero, Alberto Huertas Celdrán et al.
In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., sensors, actuators) and detect potential misbehavior that may arise due to cyberattacks, system faults, or misconfigurations. With this goal in mind, a promising research field emerged focusing on creating and managing fingerprints that model the behavior of both the device actions and its components. The article at hand studies the recent growth of the device behavior fingerprinting field in terms of application scenarios, behavioral sources, and processing and evaluation techniques. First, it performs a comprehensive review of the device types, behavioral data, and processing and evaluation techniques used by the most recent and representative research works dealing with two major scenarios: device identification and device misbehavior detection. After that, each work is deeply analyzed and compared, emphasizing its characteristics, advantages, and limitations. This article also provides researchers with a review of the most relevant characteristics of existing datasets as most of the novel processing techniques are based on machine learning and deep learning. Finally, it studies the evolution of these two scenarios in recent years, providing lessons learned, current trends, and future research challenges to guide new solutions in the area.
NCJul 18, 2020
Cyberattacks on Miniature Brain Implants to Disrupt Spontaneous Neural SignalingSergio López Bernal, Alberto Huertas Celdrán, Lorenzo Fernández Maimó et al.
Brain-Computer Interfaces (BCI) arose as systems that merge computing systems with the human brain to facilitate recording, stimulation, and inhibition of neural activity. Over the years, the development of BCI technologies has shifted towards miniaturization of devices that can be seamlessly embedded into the brain and can target single neuron or small population sensing and control. We present a motivating example highlighting vulnerabilities of two promising micron-scale BCI technologies, demonstrating the lack of security and privacy principles in existing solutions. This situation opens the door to a novel family of cyberattacks, called neuronal cyberattacks, affecting neuronal signaling. This paper defines the first two neural cyberattacks, Neuronal Flooding (FLO) and Neuronal Scanning (SCA), where each threat can affect the natural activity of neurons. This work implements these attacks in a neuronal simulator to determine their impact over the spontaneous neuronal behavior, defining three metrics: number of spikes, percentage of shifts, and dispersion of spikes. Several experiments demonstrate that both cyberattacks produce a reduction of spikes compared to spontaneous behavior, generating a rise in temporal shifts and a dispersion increase. Mainly, SCA presents a higher impact than FLO in the metrics focused on the number of spikes and dispersion, where FLO is slightly more damaging, considering the percentage of shifts. Nevertheless, the intrinsic behavior of each attack generates a differentiation on how they alter neuronal signaling. FLO is adequate to generate an immediate impact on the neuronal activity, whereas SCA presents higher effectiveness for damages to the neural signaling in the long-term.
CRApr 16, 2020
AuthCODE: A Privacy-preserving and Multi-device Continuous Authentication Architecture based on Machine and Deep LearningPedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Lorenzo Fernández Maimó et al.
The authentication field is evolving towards mechanisms able to keep users continuously authenticated without the necessity of remembering or possessing authentication credentials. While existing continuous authentication systems have demonstrated their suitability for single-device scenarios, the Internet of Things and next generation of mobile networks (5G) are enabling novel multi-device scenarios -- such as Smart Offices -- where continuous authentication is still an open challenge. The paper at hand, proposes an AI-based, privacy-preserving and multi-device continuous authentication architecture called AuthCODE. A realistic Smart Office scenario with several users, interacting with their mobile devices and personal computer, has been used to create a set of single- and multi-device behavioural datasets and validate AuthCODE. A pool of experiments with machine and deep learning classifiers measured the impact of time in authentication accuracy and improved the results of single-device approaches by considering multi-device behaviour profiles. The f1-score average reached for XGBoost on multi-device profiles based on 1-minute windows was 99.33%, while the best performance achieved for single devices was lower than 97.39%. The inclusion of temporal information in the form of vector sequences classified by a Long-Short Term Memory Network, allowed the identification of additional complex behaviour patterns associated to each user, resulting in an average f1-score of 99.02% on identification of long-term behaviours.
CRAug 9, 2019
Security in Brain-Computer Interfaces: State-of-the-art, opportunities, and future challengesSergio López Bernal, Alberto Huertas Celdrán, Gregorio Martínez Pérez et al.
BCIs have significantly improved the patients' quality of life by restoring damaged hearing, sight, and movement capabilities. After evolving their application scenarios, the current trend of BCI is to enable new innovative brain-to-brain and brain-to-the-Internet communication paradigms. This technological advancement generates opportunities for attackers since users' personal information and physical integrity could be under tremendous risk. This work presents the existing versions of the BCI life-cycle and homogenizes them in a new approach that overcomes current limitations. After that, we offer a qualitative characterization of the security attacks affecting each phase of the BCI cycle to analyze their impacts and countermeasures documented in the literature. Finally, we reflect on lessons learned, highlighting research trends and future challenges concerning security on BCIs.
CRMay 30, 2014
ROMEO: ReputatiOn Model Enhancing OpenID SimulatorGinés Dólera Tormo, Félix Gómez Mármol, Gregorio Martínez Pérez
OpenID is a standard decentralized initiative aimed at allowing Internet users to use the same personal account to access different services. Since it does not rely on any central authority, it is hard for such users or other entities to validate the trust level of each entity deployed in the system. Some research has been conducted to handle this issue, defining a reputation framework to determine the trust level of a relying party based on past experiences. However, this framework has been proposed in a theoretical way and some deeper analysis and validation is still missing. Our main contribution in this paper consist of a simulation environment able to validate the feasibility of the reputation framework and analyze its behaviour within different scenarios.