CR · AI · DC · LG · May 20, 2024

StatAvg: Mitigating Data Heterogeneity in Federated Learning for Intrusion Detection Systems

arXiv:2405.13062v1 · 12 citations · h-index: 87 · IEEE Trans Netw Serv Manag
Originality: Incremental advance
AI Analysis

The paper addresses a specific cybersecurity challenge, building more reliable federated intrusion detection systems, but the advance is incremental because it builds on existing FL methods.

The paper tackles data heterogeneity in federated learning for intrusion detection systems by proposing StatAvg, a method that aggregates and shares global statistics for data normalization, showing improved mitigation of non-iid feature distributions compared to baselines.

Federated learning (FL) is a decentralized learning technique that enables participating devices to collaboratively build a shared Machine Learning (ML) or Deep Learning (DL) model without revealing their raw data to a third party. Due to its privacy-preserving nature, FL has attracted widespread attention for building Intrusion Detection Systems (IDS) within the realm of cybersecurity. However, the data heterogeneity across participating domains and entities presents significant challenges for the reliable implementation of an FL-based IDS. In this paper, we propose an effective method called Statistical Averaging (StatAvg) to alleviate non-independently and identically distributed (non-iid) features across local clients' data in FL. In particular, StatAvg allows the FL clients to share their individual data statistics with the server, which then aggregates this information to produce global statistics. The latter are shared with the clients and used for universal data normalisation. It is worth mentioning that StatAvg can seamlessly integrate with any FL aggregation strategy, as it occurs before the actual FL training process. The proposed method is evaluated against baseline approaches using datasets for network and host Artificial Intelligence (AI)-powered IDS. The experimental results demonstrate the efficiency of StatAvg in mitigating non-iid feature distributions across the FL clients compared to the baseline methods.
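
The statistics-sharing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are exchanged or how the server combines them, so the following assumes per-feature mean and variance, sample-count-weighted pooling on the server, and z-score normalisation on the clients. The function names (local_stats, aggregate_stats, normalize) are hypothetical and are not taken from the paper.

```python
import numpy as np

def local_stats(X):
    """Client side (assumed): per-feature sample count, mean, and population variance."""
    X = np.asarray(X, dtype=float)
    return X.shape[0], X.mean(axis=0), X.var(axis=0)

def aggregate_stats(client_stats):
    """Server side (assumed): pool client statistics into a global mean and std."""
    counts = np.array([n for n, _, _ in client_stats], dtype=float)
    means = np.stack([m for _, m, _ in client_stats])
    variances = np.stack([v for _, _, v in client_stats])
    weights = (counts / counts.sum())[:, None]
    global_mean = (weights * means).sum(axis=0)
    # Law of total variance: within-client variance plus between-client spread.
    global_var = (weights * (variances + (means - global_mean) ** 2)).sum(axis=0)
    return global_mean, np.sqrt(global_var)

def normalize(X, global_mean, global_std, eps=1e-12):
    """Client side (assumed): z-score with the shared global statistics."""
    return (np.asarray(X, dtype=float) - global_mean) / (global_std + eps)

# Three synthetic clients with deliberately different feature distributions.
rng = np.random.default_rng(0)
clients = [
    rng.normal(loc=mu, scale=sd, size=(n, 3))
    for mu, sd, n in [(0.0, 1.0, 200), (5.0, 2.0, 50), (-3.0, 0.5, 120)]
]

stats = [local_stats(X) for X in clients]            # only statistics leave the clients
g_mean, g_std = aggregate_stats(stats)               # server builds global statistics
normalized = [normalize(X, g_mean, g_std) for X in clients]
pooled = np.vstack(clients)                          # used here only to check the result
```

Under these assumptions the pooled statistics equal those of the concatenated data, so every client applies the same scaling without exposing raw samples. Because the step runs once before training, any aggregation strategy could consume the normalised data afterwards, which is consistent with the abstract's integration claim.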

