LG | NI | Jun 10, 2025

When Simple Model Just Works: Is Network Traffic Classification in Crisis?

arXiv:2506.08655v1 | 1 citation | h-index: 13
Originality: Incremental advance
AI Analysis

This work highlights a critical issue for network security and management researchers, exposing that standard ML practices may be misapplied in traffic classification, potentially stalling progress in the field.

The paper addresses inflated performance estimates in network traffic classification. It shows that a simple k-NN baseline matches or beats complex neural networks, and that in most datasets over 50% of samples are redundant (identical packet sequences). This redundancy leads to overestimated accuracy and, where identical flows carry conflicting labels, lowers the maximum accuracy any model can reach.
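To make the k-NN baseline concrete, here is a minimal sketch of a nearest-neighbour classifier over packet-sequence metadata (sizes, times, and directions). The feature encoding, the squared Euclidean distance, the four-packet cutoff, and the names `to_vector` and `knn_predict` are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

def to_vector(packets, n=4):
    """Flatten the first n packets into a fixed-length vector.

    Each packet is (size, inter_arrival_time, direction) with direction
    +1 or -1. This encoding is hypothetical, not the paper's.
    """
    vec = []
    for size, iat, direction in packets[:n]:
        vec.extend([size * direction, iat])  # signed size, then timing
    vec.extend([0.0] * (2 * n - len(vec)))   # zero-pad short flows
    return vec

def knn_predict(train, query, k=1):
    """Predict the majority label among the k nearest training flows."""
    q = to_vector(query)

    def dist(item):
        v = to_vector(item[0])
        return sum((a - b) ** 2 for a, b in zip(v, q))

    nearest = sorted(train, key=dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

With k=1, a test flow whose packet sequence also occurs in the training set has distance zero to that copy, which is one plausible reason such a baseline benefits from duplicated samples.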

Machine learning has been applied to network traffic classification (TC) for over two decades. While early efforts used shallow models, the latter 2010s saw a shift toward complex neural networks, often reporting near-perfect accuracy. However, it was recently revealed that a simple k-NN baseline using packet sequence metadata (sizes, times, and directions) can be on par with or even outperform more complex methods. In this paper, we evaluate this baseline across 12 datasets and 15 TC tasks and investigate why it performs so well. Our analysis shows that most datasets contain over 50% redundant samples (identical packet sequences), which frequently appear in both training and test sets due to common splitting practices. This redundancy can lead to overestimated model performance and reduce the theoretical maximum accuracy when identical flows have conflicting labels. Given the distinct characteristics of TC, we further argue that standard machine learning practices adapted from domains like NLP or computer vision may be ill-suited for it. Finally, we propose new directions for task formulation and evaluation to address these challenges and help realign the field.
