DANTE: A framework for mining and monitoring darknet traffic
This addresses the challenge of monitoring large-scale darknet traffic for cybersecurity professionals, offering a practical tool for threat detection.
The paper tackles the problem of mining threat intelligence from darknet traffic by presenting DANTE, a framework that uses Word2Vec on port sequences and incremental clustering to detect malicious behaviors; it discovered 1,177 new emerging threats in a year-long dataset.
Trillions of network packets are sent over the Internet to destinations which do not exist. This 'darknet' traffic captures the activity of botnets and other malicious campaigns aiming to discover and compromise devices around the world. In order to mine threat intelligence from this data, one must be able to handle large streams of logs and represent the traffic patterns in a meaningful way. However, by observing how network ports (services) are used, it is possible to capture the intent of each transmission. In this paper, we present DANTE: a framework and algorithm for mining darknet traffic. DANTE learns the meaning of targeted network ports by applying Word2Vec to observed port sequences. Then, when a host sends a new sequence, DANTE represents the transmission as the average embedding of the ports found that sequence. Finally, DANTE uses a novel and incremental time-series cluster tracking algorithm on observed sequences to detect recurring behaviors and new emerging threats. To evaluate the system, we ran DANTE on a full year of darknet traffic (over three Tera-Bytes) collected by the largest telecommunications provider in Europe, Deutsche Telekom and analyzed the results. DANTE discovered 1,177 new emerging threats and was able to track malicious campaigns over time. We also compared DANTE to the current best approach and found DANTE to be more practical and effective at detecting darknet traffic patterns.