CR AI | Oct 12, 2024

A Novel Approach to Malicious Code Detection Using CNN-BiLSTM and Feature Fusion

Lixia Zhang, Tianxu Liu, Kaihui Shen, Cheng Chen

arXiv:2410.09401v1 | 5 citations | h-index: 1 | RICAI

Originality: Incremental advance

AI Analysis

The paper addresses the urgent need for efficient malware detection to protect individual privacy and critical infrastructure, though it appears incremental because it builds on existing methods such as CNN and BiLSTM with feature fusion.

This paper tackles malware detection, a growing threat to computer systems and network security. It proposes an approach that combines a CNN-BiLSTM model with feature fusion and reports improved accuracy, recall, and F1 score on public datasets, particularly for variants and obfuscated malware.

With the rapid advancement of Internet technology, the threat of malware to computer systems and network security has intensified. Malware affects individual privacy and security and poses risks to the critical infrastructure of enterprises and nations. The increasing quantity and complexity of malware, along with its concealment and diversity, challenge traditional detection techniques. Static detection methods struggle against variants and packed malware, while dynamic methods face high costs and risks that limit their application. Consequently, there is an urgent need for novel and efficient malware detection techniques that improve accuracy and robustness. This study first employs the MinHash algorithm to convert malware binary files into grayscale images, then extracts global and local texture features using the GIST and LBP algorithms. Additionally, the study uses IDA Pro to decompile the binaries and extract opcode sequences, applying N-gram and TF-IDF algorithms for feature vectorization. Fusing these features enables the model to comprehensively capture the behavioral characteristics of malware. For model construction, a CNN-BiLSTM fusion model is designed to process image features and opcode sequences simultaneously, enhancing classification performance. Experimental validation on multiple public datasets demonstrates that the proposed method significantly outperforms traditional detection techniques in accuracy, recall, and F1 score, and is more stable when detecting variants and obfuscated malware. The research presented in this paper offers new insights into the development of malware detection technologies, validates the effectiveness of feature and model fusion, and holds promising application prospects.
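
The two feature branches described in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names, the bigram size, the basic 3x3 LBP variant, and fusion by plain concatenation are all assumptions made for illustration. The MinHash image conversion, GIST extraction, IDA Pro disassembly, and the CNN-BiLSTM model itself are not covered.

```python
import math
from collections import Counter

# Hypothetical helper names; the paper does not publish this code.

def opcode_ngrams(opcodes, n=2):
    """Return the n-grams (as tuples) of an opcode sequence."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def tfidf_vectors(samples, n=2):
    """TF-IDF vectors over opcode n-grams, one vector per sample."""
    grams = [Counter(opcode_ngrams(s, n)) for s in samples]
    vocab = sorted({g for counts in grams for g in counts})
    n_docs = len(samples)
    # Document frequency: number of samples that contain each n-gram.
    df = {g: sum(1 for counts in grams if g in counts) for g in vocab}
    vectors = []
    for counts in grams:
        total = sum(counts.values()) or 1
        vectors.append(
            [(counts[g] / total) * math.log(n_docs / df[g]) for g in vocab]
        )
    return vocab, vectors

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (interior pixels)."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    count = sum(hist) or 1
    return [h / count for h in hist]

def fuse(image_features, opcode_features):
    """Assumed fusion step: concatenate the two feature vectors."""
    return list(image_features) + list(opcode_features)

# Toy inputs: two opcode sequences and one tiny grayscale image.
samples = [["push", "mov", "call", "ret"], ["push", "mov", "xor", "ret"]]
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

vocab, vectors = tfidf_vectors(samples)
fused = fuse(lbp_histogram(image), vectors[0])
```

In this toy input the bigram `("push", "mov")` occurs in both samples, so its inverse document frequency, and therefore its TF-IDF weight, is zero, while the bigrams unique to one sample carry all the weight. A real pipeline would more likely use an established library for both steps.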
