CR LG Jan 23, 2019

Deep Adversarial Learning in Intrusion Detection: A Data Augmentation Enhanced Framework

He Zhang, Xingrui Yu, Peng Ren, Chunbo Luo, Geyong Min

arXiv:1901.07949v313.056 citations

Originality: Incremental advance

AI Analysis

This work addresses scarce and imbalanced training data in intrusion detection systems, a domain-specific cybersecurity problem. It appears incremental because it builds on existing learning-based methods.

The paper tackles network intrusion detection with limited and imbalanced training data. It proposes a framework that combines deep adversarial learning with statistical methods for data augmentation, and reports improved accuracy, precision, recall, and F1-score on the KDD Cup 99 dataset.

Intrusion detection systems (IDSs) play an important role in identifying malicious attacks and threats in networking systems. As fundamental tools of IDSs, learning-based classification methods have been widely employed. When network intrusions must be detected from small samples (e.g., emerging intrusions), the limited number and imbalanced proportion of training samples usually make it difficult to train supervised and semi-supervised classifiers. In this paper, we propose a general network intrusion detection framework to address the challenges of both data scarcity and data imbalance. The novelty of the proposed framework lies in combining deep adversarial learning with statistical learning and in exploiting learning-based data augmentation. Given a small set of network intrusion samples, it first derives a Poisson-Gamma joint probabilistic generative model to generate synthesised intrusion data using Monte Carlo methods. These synthesised data are then augmented by deep generative neural networks through adversarial learning. Finally, it uses the augmented intrusion data to train supervised models for detecting network intrusions. Comprehensive experimental validation on the KDD Cup 99 dataset shows that the proposed framework outperforms existing learning-based IDSs in terms of accuracy, precision, recall, and F1-score.
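The first stage described in the abstract (a Poisson-Gamma generative model sampled by Monte Carlo) can be illustrated with a minimal sketch. The abstract does not give the model's parameterisation, so everything below is an assumption made for illustration rather than the authors' implementation: each feature is treated as an independent count whose Poisson rate has a Gamma prior, the prior is fitted by the method of moments, and the names `fit_gamma_prior`, `sample_poisson`, `synthesise`, and `SEED_ROWS` are hypothetical. The adversarial (GAN) stage and the classifier are not shown.

```python
import math
import random
import statistics

# Hypothetical seed set: 5 intrusion records with 3 count-valued features.
SEED_ROWS = [[3, 10, 0], [5, 14, 1], [2, 9, 0], [6, 20, 2], [4, 12, 0]]


def fit_gamma_prior(counts):
    """Method-of-moments Gamma(shape, scale) prior for one feature's Poisson rate.

    For a Poisson-Gamma mixture the marginal variance is mean + Var(rate),
    so the excess variance (variance - mean) is attributed to the rate.
    Returns (None, mean) when the data are not over-dispersed; the caller
    then falls back to a plain Poisson with a fixed rate.
    """
    mean = statistics.fmean(counts)
    excess = statistics.pvariance(counts) - mean
    if mean <= 0 or excess <= 0:
        return None, mean
    return mean * mean / excess, excess / mean


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's method (adequate for small rates)."""
    if rate <= 0:
        return 0
    limit = math.exp(-rate)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def synthesise(seed_rows, n_samples, seed=0):
    """Monte Carlo sampling: rate ~ Gamma(shape, scale), then count ~ Poisson(rate)."""
    rng = random.Random(seed)
    priors = [fit_gamma_prior(column) for column in zip(*seed_rows)]
    rows = []
    for _ in range(n_samples):
        row = []
        for shape, scale_or_rate in priors:
            if shape is None:
                rate = scale_or_rate  # fixed-rate fallback
            else:
                rate = rng.gammavariate(shape, scale_or_rate)
            row.append(sample_poisson(rate, rng))
        rows.append(row)
    return rows
```

Calling `synthesise(SEED_ROWS, 2000)` returns 2000 synthetic count vectors whose per-feature means track those of the seed set. In the framework described above, samples like these would then be passed to the adversarial augmentation stage.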
