A Crowdsensing Intrusion Detection Dataset For Decentralized Federated Learning Models
This work addresses security challenges in IoT crowdsensing environments and provides a dataset for studying decentralized federated learning. The contribution is incremental, however, because it builds on existing federated learning methods.
The paper tackles malware detection in IoT crowdsensing by introducing a dataset and comparing decentralized federated learning (DFL) with traditional ML and centralized federated learning (CFL). It shows that DFL maintains competitive performance while preserving data locality and outperforms CFL in most settings.
This paper introduces a dataset and an experimental study on Decentralized Federated Learning (DFL) for Internet of Things (IoT) crowdsensing malware detection. The dataset comprises behavioral records of benign activity and eight malware attacks. A total of 21,582,484 original records were collected from system calls, file system activities, resource usage, kernel events, input/output events, and network records. These records were aggregated into 30-second windows, yielding 342,106 data records used for model training and evaluation. Experiments on the DFL platform compare traditional Machine Learning (ML), Centralized Federated Learning (CFL), and DFL across different node counts, topologies, and data distributions. Results show that DFL maintains competitive performance while preserving data locality, outperforming CFL in most settings. This dataset provides a solid foundation for studying the security of IoT crowdsensing environments.
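The 30-second aggregation step can be illustrated with a minimal sketch. The abstract does not say which features are computed per window, so the code below assumes the simplest case: counting events per monitoring source within each fixed window. The function name `aggregate_windows`, the `(timestamp, source)` record layout, and the sample events are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 30  # window length stated in the abstract


def aggregate_windows(events, window_seconds=WINDOW_SECONDS):
    """Group (timestamp, source) events into fixed windows and count per source.

    `events` is an iterable of (timestamp_in_seconds, source_name) pairs.
    Returns a dict mapping window start time -> Counter of per-source counts.
    Per-source counting is an assumed feature; the paper may use others.
    """
    windows = defaultdict(Counter)
    for timestamp, source in events:
        # Floor the timestamp to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start][source] += 1
    return dict(windows)


# Hypothetical raw events: (seconds since capture start, monitoring source)
raw_events = [
    (0.5, "syscall"),
    (12.0, "syscall"),
    (29.9, "network"),
    (30.0, "syscall"),
    (45.2, "filesystem"),
    (61.0, "network"),
]

features = aggregate_windows(raw_events)
for start in sorted(features):
    print(start, dict(features[start]))
```

Here six raw events collapse into three window records. At the scale reported in the abstract, 21,582,484 raw records reduced to 342,106 windows corresponds to about 63 raw records per aggregated record on average.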