LG, Oct 4, 2023

Comparative Analysis of Imbalanced Malware Byteplot Image Classification using Transfer Learning

Jayasudha M, Ayesha Shaik, Gaurav Pendharkar, Soham Kumar, Muhesh Kumar B, Sudharshanan Balaji

arXiv:2310.02742v1, 9 citations, h-index: 9

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of class imbalance in malware detection for cybersecurity applications, but it is incremental as it applies existing transfer learning models to new datasets.

The paper compared six multiclass classification models on malware byteplot image datasets to analyze the impact of class imbalance on performance and convergence, finding that higher imbalance reduces epochs needed for convergence and that ResNet50, EfficientNetB0, and DenseNet169 handle both imbalanced and balanced data well, with a maximum precision of 97% on imbalanced data.

Cybersecurity is a major concern due to the increasing reliance on technology and interconnected systems. Malware detectors help mitigate cyber-attacks by comparing malware signatures. Machine learning can improve these detectors by automating feature extraction, identifying patterns, and enhancing dynamic analysis. In this paper, the performance of six multiclass classification models is compared on the Malimg dataset, the Blended dataset, and the Malevis dataset to gain insight into the effect of class imbalance on model performance and convergence. It is observed that the greater the class imbalance, the fewer epochs are required for convergence, accompanied by high variance in performance across the different models. Moreover, it is observed that, for malware detection, ResNet50, EfficientNetB0, and DenseNet169 handle both imbalanced and balanced data well. The maximum precision obtained is 97% on the imbalanced dataset, 95% on the intermediate-imbalance dataset, and 95% on the perfectly balanced dataset.
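
The two quantities the abstract relies on, the degree of class imbalance and multiclass precision, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the max/min definition of the imbalance ratio, the macro (unweighted) averaging of precision, and the toy labels are all assumptions made for illustration, since the abstract does not state how either quantity was computed.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count.

    Assumed definition: 1.0 means perfectly balanced, larger means more imbalanced.
    """
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision over the classes in y_true.

    A class that is never predicted contributes a precision of 0.0.
    """
    classes = sorted(set(y_true))
    per_class = []
    for c in classes:
        # True labels of every sample that the model assigned to class c.
        truths_for_pred_c = [t for t, p in zip(y_true, y_pred) if p == c]
        if truths_for_pred_c:
            correct = sum(1 for t in truths_for_pred_c if t == c)
            per_class.append(correct / len(truths_for_pred_c))
        else:
            per_class.append(0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels (not taken from Malimg, Blended, or Malevis).
y_true = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
y_pred = ["a"] * 6 + ["b"] * 2 + ["a"] + ["c"]

ratio = imbalance_ratio(y_true)
precision = macro_precision(y_true, y_pred)
print(f"imbalance ratio: {ratio:.1f}, macro precision: {precision:.3f}")
```

Macro averaging weights every malware family equally, so a rare family with poor precision lowers the score as much as a common one. A support-weighted average would instead be dominated by the largest classes on an imbalanced dataset.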
