CR LG Mar 24, 2021

CNN vs ELM for Image-Based Malware Classification

Mugdha Jain, William Andreopoulos, Mark Stamp

arXiv:2103.13820v16.615 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of costly feature extraction in malware classification, a concern for security researchers, though its contribution is incremental, since it compares existing methods on a new data representation.

The paper tackles malware classification by visualizing malware samples as images and comparing Convolutional Neural Networks (CNNs) with Extreme Learning Machines (ELMs), finding that ELMs achieve accuracies similar to those of CNNs while requiring less than 2% of the training time.
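To make the "malware as images" idea concrete, here is a minimal sketch of one common way to turn a raw binary into a grayscale image: each byte becomes one pixel, and the byte stream is wrapped into rows of a fixed width. The function name `bytes_to_image`, the default width of 256, and the zero padding of the last row are assumptions for illustration; the summary does not state which conversion the authors used.

```python
import numpy as np


def bytes_to_image(data: bytes, width: int = 256) -> np.ndarray:
    """Interpret raw bytes as a 2-D grayscale image, one byte per pixel.

    Hypothetical helper: the width and zero padding are illustrative
    choices, not parameters taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(pixels) // width)  # ceiling division
    # Zero-pad so the final, partially filled row is complete.
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(pixels)] = pixels
    return padded.reshape(height, width)


if __name__ == "__main__":
    sample = bytes(range(10))  # stand-in for the contents of a binary file
    print(bytes_to_image(sample, width=4))
```

No disassembly or execution is involved: the only input is the file's bytes, which is what makes this kind of feature cheap to obtain.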

Research in the field of malware classification often relies on machine learning models that are trained on high-level features, such as opcodes, function calls, and control flow graphs. Extracting such features is costly, since disassembly or code execution is generally required. In this paper, we conduct experiments to train and evaluate machine learning models for malware classification, based on features that can be obtained without disassembly or execution of code. Specifically, we visualize malware samples as images and employ image analysis techniques. In this context, we focus on two machine learning models, namely, Convolutional Neural Networks (CNN) and Extreme Learning Machines (ELM). Surprisingly, we find that ELMs can achieve accuracies on par with CNNs, yet ELM training requires less than 2% of the time needed to train a comparable CNN.
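The training-time gap reported in the abstract is easier to see with a sketch of how a basic ELM is fit. In the standard formulation, the hidden-layer weights are drawn at random and never updated, and only the output weights are solved in closed form by least squares, so there is no iterative backpropagation. The class below is a minimal illustration of that general technique, not the authors' implementation; the class name `ELM`, the `tanh` activation, and the hidden-layer size are assumptions.

```python
import numpy as np


class ELM:
    """Minimal single-hidden-layer Extreme Learning Machine (illustrative)."""

    def __init__(self, n_hidden: int = 64, seed: int = 0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X: np.ndarray) -> np.ndarray:
        # Random projection followed by a nonlinearity (tanh is an assumption).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X: np.ndarray, y: np.ndarray) -> "ELM":
        n_classes = int(y.max()) + 1
        # Hidden weights are sampled once and then left fixed.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        # Output weights via the Moore-Penrose pseudo-inverse (least squares).
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


if __name__ == "__main__":
    # Toy two-class data standing in for flattened image features.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
    y = np.array([0] * 20 + [1] * 20)
    model = ELM(n_hidden=32).fit(X, y)
    print("training accuracy:", (model.predict(X) == y).mean())
```

Because fitting reduces to a single linear solve, training cost is dominated by one pseudo-inverse rather than many gradient passes over the data, which is consistent with the kind of speedup the abstract describes.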
