CR · LG · Nov 29, 2021

Robust Federated Learning for execution time-based device model identification under label-flipping attack

Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, José Rafael Buendía Rubio, Gérôme Bovet, Gregorio Martínez Pérez

arXiv:2111.14434v1 · 3.8 · h-index: 34

Originality: Incremental advance

AI Analysis

This work addresses cybersecurity threats from device spoofing in IoT/5G scenarios by enabling private, accurate device identification, though it is incremental as it applies existing federated learning methods to a specific domain with attack analysis.

The paper tackles device model identification using execution-time data. It reports 0.9999 accuracy in both centralized and federated learning setups, so federation preserves data privacy without a performance loss. It also evaluates label-flipping attacks against the federated setup: Zeno and coordinate-wise median aggregation perform best, but both degrade sharply once more than 50% of the clients are malicious.

The explosion in computing device deployment experienced in recent years, motivated by advances in technologies such as the Internet of Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and the usually low complexity required to launch them. To address this issue, several solutions have emerged that identify device models and types by combining behavioral fingerprinting with Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriate for scenarios where data privacy and protection are a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with that of an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during the federated model training is evaluated, using several aggregation mechanisms as countermeasures. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades when the percentage of fully malicious clients (all training samples poisoned) grows over 50%.
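To make the attack and one of the countermeasures named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a cyclic shift as the label-flipping rule (the abstract does not say how labels are flipped), equal client weights, and toy three-parameter updates; all function names and numbers are hypothetical. Zeno is not shown. The sketch compares plain averaging with coordinate-wise median aggregation when a minority, and then a majority, of clients submit poisoned updates.

```python
from statistics import median
from typing import List, Sequence


def flip_labels(labels: Sequence[int], num_classes: int) -> List[int]:
    """Label flipping (assumed rule): shift every label to the next class."""
    return [(y + 1) % num_classes for y in labels]


def fedavg(updates: Sequence[Sequence[float]]) -> List[float]:
    """Plain mean of client parameter vectors, with equal client weights."""
    return [sum(col) / len(col) for col in zip(*updates)]


def coordinate_wise_median(updates: Sequence[Sequence[float]]) -> List[float]:
    """Median of each parameter coordinate, taken across clients."""
    return [median(col) for col in zip(*updates)]


# Four device models -> four class labels (0..3), as in the dataset description.
flipped = flip_labels([0, 1, 2, 3], num_classes=4)

# Toy updates (hypothetical values): three honest clients, two poisoned ones.
honest = [[1.0, 2.0, 3.0], [1.1, 1.9, 3.2], [0.9, 2.1, 2.8]]
poisoned = [[-10.0, 20.0, -30.0], [-12.0, 22.0, -28.0]]

minority_attack = honest + poisoned  # 2 of 5 clients malicious
mean_agg = fedavg(minority_attack)
median_agg = coordinate_wise_median(minority_attack)

# Malicious majority: 3 of 5 clients poisoned.
majority_attack = honest[:2] + poisoned + [[-11.0, 21.0, -29.0]]
median_majority = coordinate_wise_median(majority_attack)

print("flipped labels:", flipped)
print("mean, minority attack:", mean_agg)
print("median, minority attack:", median_agg)
print("median, majority attack:", median_majority)
```

With two of five clients poisoned, the mean is pulled far from the honest values while the per-coordinate median stays within the honest range. Once poisoned clients form the majority, the median itself lands on a poisoned value, which is consistent with the degradation the abstract reports above 50% malicious clients.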
