Flow-based Network Traffic Generation using Generative Adversarial Networks
This work addresses the need for synthetic network traffic data for evaluating network intrusion detection systems (NIDS). It is an incremental contribution that adapts existing GAN techniques to a domain-specific challenge.
The paper tackles the problem of generating realistic flow-based network traffic for evaluating network intrusion detection systems. It proposes a GAN-based methodology with three preprocessing approaches that convert categorical attributes, such as IP addresses, into continuous values. Results show that two of the three approaches produce high-quality data.
Flow-based data sets are necessary for evaluating network-based intrusion detection systems (NIDS). In this work, we propose a novel methodology for generating realistic flow-based network traffic. Our approach is based on Generative Adversarial Networks (GANs), which achieve good results for image generation. A major challenge lies in the fact that GANs can only process continuous attributes. However, flow-based data inevitably contain categorical attributes such as IP addresses or port numbers. Therefore, we propose three different preprocessing approaches that transform flow-based data into continuous values. Further, we present a new method for evaluating the generated flow-based network traffic, which uses domain knowledge to define quality tests. We apply the three approaches to generate flow-based network traffic based on the CIDDS-001 data set. Experiments indicate that two of the three approaches are able to generate high-quality data.
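To make the preprocessing problem concrete, the following is a minimal sketch of how categorical flow attributes could be mapped to continuous values before being fed to a GAN. The abstract does not name the three approaches, so the two encodings shown here (per-octet scaling and a bit-vector representation) are generic illustrations chosen as assumptions, not the authors' method. All function names are hypothetical.

```python
import ipaddress


def ip_to_octets(ip: str) -> list[float]:
    """Assumed encoding 1: scale each IPv4 octet into [0, 1]."""
    return [octet / 255.0 for octet in ipaddress.IPv4Address(ip).packed]


def ip_to_bits(ip: str) -> list[float]:
    """Assumed encoding 2: represent the address as 32 values in {0.0, 1.0}."""
    value = int(ipaddress.IPv4Address(ip))
    # Most significant bit first.
    return [float((value >> shift) & 1) for shift in range(31, -1, -1)]


def port_to_unit(port: int) -> float:
    """Scale a TCP/UDP port number into [0, 1]."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port / 65535.0


def octets_to_ip(values: list[float]) -> str:
    """Invert ip_to_octets for generated samples, clamping to valid octets."""
    octets = [min(255, max(0, round(v * 255))) for v in values]
    return ".".join(str(o) for o in octets)


if __name__ == "__main__":
    print(ip_to_octets("192.168.0.255"))
    print(sum(ip_to_bits("128.0.0.1")))
    print(octets_to_ip(ip_to_octets("10.0.0.1")))
```

Per-octet scaling is compact but imposes an artificial ordering on addresses, while the bit-vector form avoids that at the cost of a wider input. An inverse mapping such as `octets_to_ip` is needed in either case, because the generator emits continuous values that must be turned back into valid flow attributes.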