Toward Generative Data Augmentation for Traffic Classification
This work addresses the limited use of data augmentation in networking, specifically for traffic classification. It is incremental in that it applies existing methods to a new dataset.
The authors tackle the underuse of data augmentation (DA) in traffic classification by applying 14 hand-crafted augmentation methods to the MIRAGE19 dataset, showing that DA can improve model performance in this domain.
Data Augmentation (DA), the practice of augmenting training data with synthetic samples, is widely adopted in Computer Vision (CV) to improve model performance. In contrast, DA has not yet been popularized in networking use cases, including Traffic Classification (TC). In this work, we present a preliminary study of 14 hand-crafted DAs applied to the MIRAGE19 dataset. Our results (i) show that DA can reap benefits previously unexplored in TC and (ii) foster a research agenda on the use of generative models to automate DA design.
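To make the idea of a hand-crafted DA concrete, the following is a minimal sketch of what one such transformation could look like. It is an illustration only: the abstract does not list the 14 methods, so the jitter transformation, the representation of a flow as a list of packet sizes, and the names `jitter_packet_sizes` and `augment` are all assumptions made for this example, not the authors' implementation.

```python
import random


def jitter_packet_sizes(sizes, max_shift=16, min_size=0, max_size=1500, rng=None):
    """Return a copy of `sizes` with each value shifted by a small random amount.

    Hypothetical transformation: the shift bound and the size limits are
    assumptions chosen for illustration, not values from the paper.
    """
    rng = rng or random.Random()
    return [
        # Clamp each jittered value so it stays a plausible packet size.
        min(max_size, max(min_size, s + rng.randint(-max_shift, max_shift)))
        for s in sizes
    ]


def augment(dataset, copies=2, seed=0):
    """Extend `dataset` with `copies` jittered variants of every sample.

    `dataset` is assumed to be a list of (packet_sizes, label) pairs.
    Labels are kept unchanged: the transformation is assumed to preserve
    the class of the flow.
    """
    rng = random.Random(seed)
    augmented = list(dataset)  # keep the original samples
    for sizes, label in dataset:
        for _ in range(copies):
            augmented.append((jitter_packet_sizes(sizes, rng=rng), label))
    return augmented


# Toy flows with made-up labels, used only to exercise the functions.
toy_flows = [([60, 1500, 40, 800], "app_a"), ([120, 120, 300], "app_b")]
training_set = augment(toy_flows, copies=2)
```

With two toy flows and `copies=2`, `training_set` holds the two original samples plus four synthetic ones. Each synthetic sample keeps the length and the label of the flow it was derived from.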