All Infections are Not Created Equal: Time-Sensitive Prediction of Malware Generated Network Attacks
This work provides an early warning system for network administrators to prevent malware attacks and minimize user inconvenience by predicting both the occurrence and timing of attacks.
This paper addresses the problem of predicting both the occurrence and the timing of malware-generated network attacks before they manifest. By modeling infection sequences as Markov and Semi-Markov chains, the authors predict 98% of spamming and port-scanning attacks before they occur and accurately foretell the time of 97% of these attacks.
Many techniques have been proposed for quickly detecting and containing malware-generated network attacks such as large-scale denial of service attacks; unfortunately, much damage is already done within the first few minutes of an attack, before it is identified and contained. There is a need for an early warning system that can predict attacks before they actually manifest, so that upcoming attacks can be prevented altogether by blocking the hosts that are likely to engage in them. However, blocking responses may disrupt legitimate processes on blocked hosts; in order to minimise user inconvenience, it is important to also foretell the time when the predicted attacks will occur, so that only the most urgent threats result in auto-blocking responses, while less urgent ones are first manually investigated. To this end, we identify a typical infection sequence followed by modern malware; modelling this sequence as a Markov chain and training it on real malicious traffic, we are able to identify the behaviour most likely to lead to attacks and predict 98% of real-world spamming and port-scanning attacks before they occur. Moreover, using a Semi-Markov chain model, we are able to foretell the time of upcoming attacks, a novel capability that accurately predicts the times of 97% of real-world malware attacks. Our work represents an important and timely step towards enabling flexible threat response models that minimise disruption to legitimate users.
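As a rough illustration of the modelling idea described above, the following minimal sketch fits a first-order Markov chain and per-transition mean holding times (a simple Semi-Markov approximation) to timestamped infection traces, then estimates the probability of an attack and the expected time until it. The stage names, the toy traces, and all function names are hypothetical and chosen only for illustration; the abstract does not specify the authors' states, features, or estimation procedure, and the expected-time calculation assumes every trace eventually reaches the attack state.

```python
from collections import defaultdict

ATTACK = "attack"  # hypothetical name for the absorbing attack state

# Toy traces of (stage, timestamp in seconds); invented for illustration only.
TRACES = [
    [("scan_in", 0), ("exploit", 10), ("download", 30), ("c2", 60), ("attack", 160)],
    [("scan_in", 0), ("exploit", 20), ("download", 40), ("attack", 100)],
]

def fit_semi_markov(timed_sequences):
    """Estimate transition probabilities and mean holding time per transition."""
    counts = defaultdict(lambda: defaultdict(int))
    hold_sum = defaultdict(float)
    for seq in timed_sequences:
        for (cur, t0), (nxt, t1) in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
            hold_sum[(cur, nxt)] += t1 - t0
    probs, mean_hold = {}, {}
    for cur, nxts in counts.items():
        total = sum(nxts.values())
        probs[cur] = {}
        for nxt, n in nxts.items():
            probs[cur][nxt] = n / total
            mean_hold[(cur, nxt)] = hold_sum[(cur, nxt)] / n
    return probs, mean_hold

def attack_probability(probs, start, steps):
    """Probability of reaching ATTACK within `steps` transitions from `start`."""
    if start == ATTACK:
        return 1.0
    dist, reached = {start: 1.0}, 0.0
    for _ in range(steps):
        nxt_dist = defaultdict(float)
        for state, p in dist.items():
            for nxt, q in probs.get(state, {}).items():
                if nxt == ATTACK:
                    reached += p * q  # probability mass absorbed by the attack state
                else:
                    nxt_dist[nxt] += p * q
        dist = nxt_dist
    return reached

def expected_time_to_attack(probs, mean_hold, start, sweeps=100):
    """Iterate T(s) = sum_n P(s, n) * (hold(s, n) + T(n)), with T(ATTACK) = 0."""
    t = defaultdict(float)
    for _ in range(sweeps):
        for cur, nxts in probs.items():
            t[cur] = sum(p * (mean_hold[(cur, nxt)] + t[nxt]) for nxt, p in nxts.items())
    return t[start]

probs, mean_hold = fit_semi_markov(TRACES)
print(attack_probability(probs, "download", steps=2))
print(expected_time_to_attack(probs, mean_hold, "scan_in"))
```

In a response policy of the kind the abstract motivates, a host whose estimated time to attack falls below some threshold could be auto-blocked, while hosts with longer estimates are queued for manual investigation; the threshold itself would be a deployment choice rather than anything given in the source.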