LG DC | May 16, 2024

Federated Learning for Misbehaviour Detection with Variational Autoencoders and Gaussian Mixture Models

Enrique Mármol Campos, Aurora González Vidal, José Luis Hernández Ramos, Antonio Skarmeta

arXiv:2405.09903v16.42 citations | h-index: 56 | Has Code | Int J Inf Secur

Originality: Incremental advance

AI Analysis

This work addresses the need for privacy-preserving and efficient misbehavior detection in vehicles, though it is incremental in that it combines existing techniques in a federated setting.

The paper tackles the problem of detecting unknown cyberattacks in vehicular networks by proposing an unsupervised federated learning approach based on Variational Autoencoders and Gaussian Mixture Models. The authors report performance above 80 percent, which they describe as better than recent proposals that mostly rely on supervised methods.

Federated Learning (FL) has become an attractive approach to collaboratively train Machine Learning (ML) models while preserving the privacy of the data sources. However, most existing FL approaches are based on supervised techniques, which can require resource-intensive activities and human intervention to obtain labelled datasets. Furthermore, in the scope of cyberattack detection, such techniques are not able to identify previously unknown threats. In this direction, this work proposes a novel unsupervised FL approach for the identification of potential misbehavior in vehicular environments. We leverage the computing capabilities of public cloud services for model aggregation, and also as a central repository of misbehavior events, enabling cross-vehicle learning and collective defense strategies. Our solution integrates Gaussian Mixture Models (GMM) and Variational Autoencoders (VAE) on the VeReMi dataset in a federated environment, where each vehicle is intended to train only with its own data. Furthermore, we use Restricted Boltzmann Machines (RBM) for pre-training, and Fedplus as the aggregation function to enhance model convergence. Our approach provides better performance (more than 80 percent) compared to recent proposals, which are usually based on supervised techniques and artificial divisions of the VeReMi dataset.
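
To make the aggregation step in the abstract more concrete, the following is a minimal sketch of server-side aggregation in a federated setting. It is not the authors' implementation. It shows a size-weighted average of client parameters (FedAvg-style), followed by a mixing step in which each vehicle keeps a fraction of its own parameters instead of overwriting them with the global model. That second step is only loosely modeled on the idea behind Fedplus; the exact rule used in the paper is not stated here. All names (`weighted_average`, `mix_with_aggregate`, `theta`) and the toy parameter values are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each model is a dict of flat parameter lists.
Weights = Dict[str, List[float]]


def weighted_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    aggregate: Weights = {}
    for name, reference in client_weights[0].items():
        aggregate[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return aggregate


def mix_with_aggregate(local: Weights, aggregate: Weights, theta: float) -> Weights:
    """Keep a fraction `theta` of the local parameters; take the rest from the aggregate.

    `theta` is an assumed hyperparameter: 0.0 replaces the local model with the
    aggregate, 1.0 ignores the aggregate entirely.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return {
        name: [theta * l + (1.0 - theta) * a for l, a in zip(local[name], aggregate[name])]
        for name in local
    }


# Toy example: two vehicles, one parameter tensor each (values are made up).
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
sizes = [1, 3]

aggregate = weighted_average(clients, sizes)
personalised = mix_with_aggregate(clients[0], aggregate, theta=0.5)
```

With `theta` strictly between 0 and 1, a vehicle whose local data differs from the rest of the fleet is pulled toward the shared model without losing its local fit entirely, which is one plausible reason to prefer such a rule over plain averaging when each vehicle trains only on its own data.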
