Rank-1 Similarity Matrix Decomposition For Modeling Changes in Antivirus Consensus Through Time
This work addresses the problem of understanding antivirus consensus dynamics for cybersecurity researchers. Its contribution is incremental: it refines existing theories rather than delivering a major breakthrough.
Using 25 million VirusTotal reports spanning over a decade, the study challenged the belief that correlations between antivirus engines stem mainly from vendors copying the labels of leading vendors. It found that first-order interactions explain less of this behavior than previously thought, and that the relationships between engines are highly volatile.
Although groups of strongly correlated antivirus engines are known to exist, at present there is limited understanding of how or why these correlations came to be. Using a corpus of 25 million VirusTotal reports representing over a decade of antivirus scan data, we challenge prevailing wisdom that these correlations primarily originate from "first-order" interactions such as antivirus vendors copying the labels of leading vendors. We introduce the Temporal Rank-1 Similarity Matrix decomposition (R1SM-T) in order to investigate the origins of these correlations and to model how consensus amongst antivirus engines changes over time. We reveal that first-order interactions do not explain as much behavior in antivirus correlation as previously thought, and that the relationships between antivirus engines are highly volatile. We make recommendations on items in need of future study and consideration based on our findings.