MalGrid: Visualization Of Binary Features In Large Malware Corpora
MalGrid gives cybersecurity analysts a tool for rapidly screening malware and understanding relationships among samples, though the contribution is incremental because it builds on existing visualization and feature representation methods.
The authors tackle the problem of visualizing large malware corpora by developing MalGrid, a system that maps millions of malware samples to points in a 2D spatial grid to reveal relationships among them and support triage. A case study correlates the effect of packing on the malware data with the complexity of the packing algorithm.
The number of malware samples is constantly on the rise. Though most new samples are modifications of existing ones, their sheer number is overwhelming. In this paper, we present a novel system to visualize and map millions of malware samples to points in a 2-dimensional (2D) spatial grid. This enables visualizing relationships within large malware datasets, which can be used to develop triage solutions that screen different malware rapidly and provide situational awareness. Our approach links two visualizations within an interactive display. The first view is a spatial point-based visualization of similarity among the samples, based on a reduced-dimensional projection of binary feature representations of malware. The second, spatial grid-based view provides better insight into similarities and differences between selected malware samples in terms of the binary-based visual representations they share. We also provide a case study where the effect of packing on the malware data is correlated with the complexity of the packing algorithm.
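The pipeline the abstract describes (binary features, a reduced-dimensional projection, then placement on a 2D grid) can be illustrated with a minimal sketch. The abstract does not specify which features or projection MalGrid uses, so everything here is an assumption made for illustration: a byte-frequency histogram stands in for the binary feature representation, a fixed Gaussian random projection stands in for the dimensionality reduction, and simple min-max quantization stands in for the grid mapping. The function names `byte_histogram`, `project_2d`, `to_cells`, and `map_to_grid` are hypothetical and do not come from the paper.

```python
import random


def byte_histogram(data: bytes) -> list:
    """Normalized 256-bin byte-frequency vector (a stand-in binary feature)."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = len(data) or 1  # avoid division by zero on empty input
    return [c / total for c in counts]


def project_2d(features: list, seed: int = 0) -> list:
    """Project feature vectors to 2D with a fixed Gaussian random matrix.

    Assumption: random projection is used only because it is simple;
    it is not claimed to be the projection used by MalGrid.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    axes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(2)]
    return [
        tuple(sum(w * x for w, x in zip(axis, vec)) for axis in axes)
        for vec in features
    ]


def to_cells(points: list, size: int) -> list:
    """Quantize 2D points into (column, row) cells of a size x size grid."""
    def quantize(value, lo, hi):
        if hi == lo:  # all points share this coordinate
            return 0
        # The maximum value would land on index `size`, so clamp it.
        return min(size - 1, int((value - lo) / (hi - lo) * size))

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return [
        (quantize(x, min(xs), max(xs)), quantize(y, min(ys), max(ys)))
        for x, y in points
    ]


def map_to_grid(samples: dict, size: int = 8) -> dict:
    """Map {name: raw bytes} to {name: (column, row)} grid cells."""
    names = list(samples)
    features = [byte_histogram(samples[n]) for n in names]
    cells = to_cells(project_2d(features), size)
    return dict(zip(names, cells))


if __name__ == "__main__":
    # Toy byte strings, not real malware.
    demo = {
        "sample_a": b"\x00" * 64 + b"MZ",
        "sample_a_copy": b"\x00" * 64 + b"MZ",
        "sample_b": bytes(range(256)),
    }
    print(map_to_grid(demo))
```

In this sketch, byte-identical inputs always share a cell because every step is deterministic, and quantization lets several samples occupy the same cell. How a system operating on millions of samples resolves such collisions, or chooses the grid resolution, is not stated in the abstract and is left open here.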