CR | NI | Jun 9, 2020

Towards Generating Benchmark Datasets for Worm Infection Studies

arXiv:2006.05167v6 | 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a data scarcity problem for researchers in digital forensics, though it is incremental as it builds on an existing tool.

The paper tackles the lack of suitable datasets for evaluating worm infection detection methods by proposing a technique to generate realistic datasets containing both normal and worm traffic, resulting in publicly available datasets for worms like Slammer and Code Red.

Worm origin identification and propagation path reconstruction are among the essential problems in digital forensics. Several methods have been proposed for this purpose, but evaluating them is a major challenge because no suitable datasets contain both normal background traffic and worm traffic. In this paper, we investigate different methods of generating such datasets and suggest a technique for this purpose. Our technique builds on ReaSE, a tool for creating realistic simulation environments, which we modify to make it suitable for generating the datasets. Using this technique, we then generate several datasets for Slammer, Code Red I, Code Red II, and modified versions of these worms in different scenarios, and make them publicly available.

Code Implementations: 1 repo
