CR · LG · Mar 7, 2021

Cluster Analysis of Malware Family Relationships

arXiv:2103.05761v1 · 5 citations
Originality Synthesis-oriented
AI Analysis

This work addresses malware analysis for cybersecurity researchers, but it is incremental as it applies an existing method to new data.

The paper uses K-means clustering to analyze relationships between malware families and types, on a dataset of 20 families with 1000 samples each, and finds that K-means can be a powerful tool for this kind of data exploration.

In this paper, we use $K$-means clustering to analyze various relationships between malware samples. We consider a dataset comprising 20 malware families with 1000 samples per family. These families can be categorized into seven different types of malware. We perform clustering based on pairs of families and use the results to determine relationships between families. We perform a similar cluster analysis based on malware type. Our results indicate that $K$-means clustering can be a powerful tool for data exploration of malware family relationships.
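The pairwise clustering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the feature vectors here are synthetic 2-D Gaussian points, since the abstract does not name the features, and the purity score is only one assumed way to summarize how well two families separate. The names `make_family`, `kmeans`, and `pair_purity` are hypothetical.

```python
import random
from collections import Counter


def make_family(center, n, seed):
    """Synthetic stand-in for one family's feature vectors (assumption)."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(c, 1.0) for c in center) for _ in range(n)]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def pair_purity(samples_a, samples_b, seed=0):
    """Cluster two families with K=2 and return the cluster purity.

    Purity near 1.0 suggests the families separate cleanly; purity near
    0.5 suggests they are hard to tell apart in this feature space.
    """
    points = samples_a + samples_b
    truth = [0] * len(samples_a) + [1] * len(samples_b)
    labels = kmeans(points, 2, seed=seed)
    correct = 0
    for cluster in set(labels):
        counts = Counter(t for t, lab in zip(truth, labels) if lab == cluster)
        correct += max(counts.values())
    return correct / len(points)


if __name__ == "__main__":
    family_a = make_family((0.0, 0.0), 100, seed=1)
    family_b = make_family((10.0, 10.0), 100, seed=2)  # far from family_a
    family_c = make_family((0.0, 0.0), 100, seed=3)    # overlaps family_a
    print("A vs B purity:", pair_purity(family_a, family_b))
    print("A vs C purity:", pair_purity(family_a, family_c))
```

Running `pair_purity` over every pair of families would give a matrix of scores in which low values flag candidate relationships. The same function could be applied to samples grouped by malware type instead of by family.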

