CR · Sep 20, 2017

Automatic Detection of Malware-Generated Domains with Recurrent Neural Models

arXiv:1709.07102v1 · 69 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of defending against malware command-and-control servers, a concern for cybersecurity practitioners. It is incremental, however, because it applies an existing neural method to a specific application area.

The paper tackles the problem of detecting domain names produced by domain-generation algorithms (DGAs) using a recurrent neural network. It reports an F1 score of 0.971 and detects 93% of such domains at a false positive rate of 1:100.

Modern malware families often rely on domain-generation algorithms (DGAs) to determine rendezvous points with their command-and-control servers. Traditional defence strategies (such as blacklisting domains or IP addresses) are inadequate against such techniques because of the large and continuously changing list of domains produced by these algorithms. This paper demonstrates that a machine learning approach based on recurrent neural networks is able to detect domain names generated by DGAs with high precision. The neural models are estimated on a large training set of domains generated by various malware families. Experimental results show that this data-driven approach can detect malware-generated domain names with an F_1 score of 0.971. To put it differently, the model can automatically detect 93% of malware-generated domain names at a false positive rate of 1:100.
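
The two figures quoted in the abstract (an F_1 score, and a detection rate at a fixed false positive rate of 1:100) can be computed from any classifier's outputs. The sketch below is illustrative only and is not the authors' evaluation code. It assumes that label 1 means "malware-generated", that a higher score means "more likely malware-generated", and that a domain is flagged when its score is strictly above the threshold. The function names `f1_score` and `detection_rate_at_fpr` are hypothetical.

```python
from typing import List, Tuple


def f1_score(labels: List[int], preds: List[int]) -> float:
    """F1 = 2PR / (P + R), where label 1 means 'malware-generated' (assumption)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def detection_rate_at_fpr(
    labels: List[int], scores: List[float], max_fpr: float = 0.01
) -> Tuple[float, float]:
    """Return (threshold, detection rate) at a false positive rate <= max_fpr.

    A domain is flagged when its score is strictly greater than the threshold.
    """
    benign = sorted((s for y, s in zip(labels, scores) if y == 0), reverse=True)
    malicious = [s for y, s in zip(labels, scores) if y == 1]
    if not malicious:
        raise ValueError("need at least one malware-generated example")
    # Number of benign domains we may wrongly flag (1 in 100 by default).
    allowed = int(max_fpr * len(benign))
    # Using the (allowed + 1)-th highest benign score as the threshold means
    # at most `allowed` benign scores lie strictly above it, even with ties.
    threshold = benign[allowed] if allowed < len(benign) else float("-inf")
    detected = sum(1 for s in malicious if s > threshold)
    return threshold, detected / len(malicious)
```

Fixing the false positive rate first and then reading off the detection rate reflects how such a detector would typically be deployed: the operator decides how many benign domains may be flagged, and the threshold follows from that budget.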
