CROct 6, 2021
PWG-IDS: An Intrusion Detection Model for Solving Class Imbalance in IIoT Networks Using Generative Adversarial NetworksLei Zhang, Shuaimin Jiang, Xiajiong Shen et al.
With the continuous development of industrial IoT (IIoT) technology, network security is becoming more and more important. And intrusion detection is an important part of its security. However, since the amount of attack traffic is very small compared to normal traffic, this imbalance makes intrusion detection in it very difficult. To address this imbalance, an intrusion detection system called pretraining Wasserstein generative adversarial network intrusion detection system (PWG-IDS) is proposed in this paper. This system is divided into two main modules: 1) In this module, we introduce the pretraining mechanism in the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) for the first time, firstly using the normal network traffic to train the WGAN-GP, and then inputting the imbalance data into the pre-trained WGAN-GP to retrain and generate the final required data. 2) Intrusion detection module: We use LightGBM as the classification algorithm to detect attack traffic in IIoT networks. The experimental results show that our proposed PWG-IDS outperforms other models, with F1-scores of 99% and 89% on the 2 datasets, respectively. And the pretraining mechanism we proposed can also be widely used in other GANs, providing a new way of thinking for the training of GANs.
LGAug 30, 2020
A Survey of Deep Active LearningPengzhen Ren, Yun Xiao, Xiaojun Chang et al.
Active learning (AL) attempts to maximize the performance gain of the model by marking the fewest samples. Deep learning (DL) is greedy for data and requires a large amount of data supply to optimize massive parameters, so that the model learns how to extract high-quality features. In recent years, due to the rapid development of internet technology, we are in an era of information torrents and we have massive amounts of data. In this way, DL has aroused strong interest of researchers and has been rapidly developed. Compared with DL, researchers have relatively low interest in AL. This is mainly because before the rise of DL, traditional machine learning requires relatively few labeled samples. Therefore, early AL is difficult to reflect the value it deserves. Although DL has made breakthroughs in various fields, most of this success is due to the publicity of the large number of existing annotation datasets. However, the acquisition of a large number of high-quality annotated datasets consumes a lot of manpower, which is not allowed in some fields that require high expertise, especially in the fields of speech recognition, information extraction, medical images, etc. Therefore, AL has gradually received due attention. A natural idea is whether AL can be used to reduce the cost of sample annotations, while retaining the powerful learning capabilities of DL. Therefore, deep active learning (DAL) has emerged. Although the related research has been quite abundant, it lacks a comprehensive survey of DAL. This article is to fill this gap, we provide a formal classification method for the existing work, and a comprehensive and systematic overview. In addition, we also analyzed and summarized the development of DAL from the perspective of application. Finally, we discussed the confusion and problems in DAL, and gave some possible development directions for DAL.
CRMay 9, 2020
Cloud-based Federated Boosting for Mobile CrowdsensingZhuzhu Wang, Yilong Yang, Yang Liu et al.
The application of federated extreme gradient boosting to mobile crowdsensing apps brings several benefits, in particular high performance on efficiency and classification. However, it also brings a new challenge for data and model privacy protection. Besides it being vulnerable to Generative Adversarial Network (GAN) based user data reconstruction attack, there is not the existing architecture that considers how to preserve model privacy. In this paper, we propose a secret sharing based federated learning architecture FedXGB to achieve the privacy-preserving extreme gradient boosting for mobile crowdsensing. Specifically, we first build a secure classification and regression tree (CART) of XGBoost using secret sharing. Then, we propose a secure prediction protocol to protect the model privacy of XGBoost in mobile crowdsensing. We conduct a comprehensive theoretical analysis and extensive experiments to evaluate the security, effectiveness, and efficiency of FedXGB. The results indicate that FedXGB is secure against the honest-but-curious adversaries and attains less than 1% accuracy loss compared with the original XGBoost model.