Sequence Aggregation Rules for Anomaly Detection in Computer Network Traffic
This work addresses anomaly detection for cybersecurity applications, but it is incremental because it applies existing methods to network traffic data.
The paper tackles the problem of detecting anomalies in computer network traffic by evaluating sequence aggregation rules for unsupervised anomaly detection on the CICIDS2017 dataset, and finds that a frequency-based model performs as well as or better than LSTM models in most cases.
We evaluate methods for applying unsupervised anomaly detection to cybersecurity applications on computer network traffic data, or flow. We borrow from the natural language processing literature and conceptualize flow as a sort of "language" spoken between machines. Five sequence aggregation rules are evaluated for their efficacy in flagging multiple attack types in a labeled flow dataset, CICIDS2017. For sequence modeling, we rely on long short-term memory (LSTM) recurrent neural networks (RNN). Additionally, a simple frequency-based model is described and its performance with respect to attack detection is compared to the LSTM models. We conclude that the frequency-based model tends to perform as well as or better than the LSTM models for the tasks at hand, with a few notable exceptions.
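To make the idea of a frequency-based model combined with sequence aggregation more concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes each flow record has already been discretized into a string token, scores each token by its smoothed negative log frequency in the training data, and then collapses the per-token scores into one sequence-level score. The function names, the add-one smoothing, and the three aggregation rules shown (`mean`, `max`, `min`) are illustrative assumptions; the abstract does not enumerate the five rules actually studied.

```python
import math
from collections import Counter

def fit_frequency_model(train_sequences):
    """Count token occurrences in training sequences (assumed mostly benign)."""
    counts = Counter(tok for seq in train_sequences for tok in seq)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot reserved for unseen tokens
    return counts, total, vocab_size

def token_scores(seq, model):
    """Per-token anomaly score: negative log of the add-one smoothed frequency."""
    counts, total, vocab_size = model
    return [-math.log((counts[tok] + 1) / (total + vocab_size)) for tok in seq]

# Hypothetical aggregation rules, chosen for illustration only.
AGGREGATORS = {
    "mean": lambda scores: sum(scores) / len(scores),
    "max": max,
    "min": min,
}

def sequence_score(seq, model, rule="mean"):
    """Collapse per-token scores into a single score for the whole sequence."""
    return AGGREGATORS[rule](token_scores(seq, model))

# Toy usage with made-up flow tokens (protocol plus destination port).
train = [["tcp80", "tcp80", "tcp443"], ["tcp80", "udp53"]]
model = fit_frequency_model(train)
common = sequence_score(["tcp80", "tcp80"], model, rule="mean")
rare = sequence_score(["tcp80", "tcp31337"], model, rule="mean")
```

In this toy setup the sequence containing a token never seen in training receives the higher score. The choice of rule matters: `max` flags a sequence on the strength of a single rare token, whereas `mean` dilutes that token across the sequence length. An LSTM-based variant would replace `token_scores` with per-token losses from a trained sequence model while leaving the aggregation step unchanged.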