CR · Jan 15, 2021

Identifying Authorship Style in Malicious Binaries: Techniques, Challenges & Datasets

Jason Gray, Daniele Sgandurra, Lorenzo Cavallaro

arXiv:2101.06124v2 · 12.3 · 10 citations

Originality: Synthesis-oriented

AI Analysis

The survey addresses the challenge of binary attribution for cybersecurity professionals, but its contribution is incremental: it primarily reviews existing methods and provides a dataset.

This survey tackles the problem of attributing malware to its creators by analyzing authorship style in binaries, identifying adversarial techniques that hinder attribution, and publishing a dataset of 15,660 malware samples labeled to 164 threat actor groups to address data scarcity.

Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to identify authorship style. Our survey explores malicious author style and the adversarial techniques used by them to remain anonymous. We examine the adversarial impact on the state-of-the-art methods. We identify key findings and explore the open research challenges. To mitigate the lack of ground truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 15,660 malware labeled to 164 threat actor groups.
