CRApr 22

Image-Based Malware Type Classification on MalNet-Image Tiny: Effects of Multi-Scale Fusion, Transfer Learning, Data Augmentation, and Schedule-Free Optimization

Ahmed A. Abouelkhaire, Waleed A. Yousef, Issa Traor

arXiv:2604.2115315.2h-index: 2

AI Analysis

For researchers working on malware classification from image representations, this paper provides an ablation study showing that pretraining and augmentation yield the largest gains, while FPN improves precision and AUC.

This paper studies 43-class malware type classification on MalNet-Image Tiny, achieving macro-F1 of 0.6927 with the best configuration combining pretraining, Mixup, TrivialAugment, and FPN, compared to a baseline of 0.6510.

This paper studies 43-class malware type classification on MalNet-Image Tiny, a public benchmark derived from Android APK files. The goal is to assess whether a compact image classifier benefits from four components evaluated in a controlled ablation: a feature pyramid network (FPN) for scale variation induced by resizing binaries of different lengths, ImageNet pretraining, lightweight augmentation through Mixup and TrivialAugment, and schedule-free AdamW optimization. All experiments use a ResNet18 backbone and the provided train/validation/test split. Reproducing the benchmark-style configuration yields macro-F1 (F1_macro) of 0.6510, consistent with the reported baseline of approximately 0.65. Replacing the optimizer with schedule-free AdamW and using unweighted cross-entropy increases F1_macro to 0.6535 in 10 epochs, compared with 96 epochs for the reproduced baseline. The best configuration combines pretraining, Mixup, TrivialAugment, and FPN, reaching F1_macro=0.6927, P_macro=0.7707, AUC_macro=0.9556, and L_test=0.8536. The ablation indicates that the largest gains in F1_macro arise from pretraining and augmentation, whereas FPN mainly improves P_macro, AUC_macro, and L_test in the strongest configuration.

View on arXiv PDF

Similar