CR LG ML | Mar 3, 2021

Malware Classification with GMM-HMM Models

Jing Zhao, Samanvitha Basole, Mark Stamp

arXiv:2103.02753v1 | 10.710 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses malware classification for cybersecurity applications. Its contribution is incremental: it adapts an existing method (GMM-HMMs) to a new domain, with mixed improvements over discrete HMMs.

The paper applies Gaussian mixture model-HMMs (GMM-HMMs) to malware classification, using opcode sequences and entropy-based sequences as features. GMM-HMMs perform comparably to discrete HMMs on the opcode features, but generally improve significantly on the discrete HMM results for the entropy-based features.
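To make "entropy-based sequences" concrete, here is a minimal sketch of one common way to turn a binary into a real-valued sequence: Shannon entropy computed over a sliding window of raw bytes. The window size, step, and the function names `shannon_entropy` and `entropy_sequence` are illustrative assumptions; the summary above does not say how the paper computes its entropy features.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(window: bytes) -> float:
    """Shannon entropy of a byte window, in bits per byte (0.0 to 8.0)."""
    if not window:
        return 0.0
    n = len(window)
    counts = Counter(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_sequence(data: bytes, window_size: int = 256, step: int = 128) -> List[float]:
    """Slide a window over the bytes and record the entropy of each window.

    window_size and step are assumed values chosen for illustration only.
    """
    if not data:
        return []
    if len(data) < window_size:
        # Input shorter than one window: score it as a single window.
        return [shannon_entropy(data)]
    return [
        shannon_entropy(data[i:i + window_size])
        for i in range(0, len(data) - window_size + 1, step)
    ]
```

Each value in the resulting sequence is a continuous number rather than a symbol from a finite alphabet, which is the kind of observation a Gaussian mixture emission can model directly, whereas a discrete HMM would first need the values binned.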

Discrete hidden Markov models (HMMs) are often applied to malware detection and classification problems. However, the continuous analog of the discrete HMM, the Gaussian mixture model-HMM (GMM-HMM), is rarely considered in the field of cybersecurity. In this paper, we use GMM-HMMs for malware classification and we compare our results to those obtained using discrete HMMs. As features, we consider opcode sequences and entropy-based sequences. For our opcode features, GMM-HMMs produce results that are comparable to those obtained using discrete HMMs, whereas for our entropy-based features, GMM-HMMs generally improve significantly on the classification results that we have achieved with discrete HMMs.
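The difference between the two model types is the emission distribution: a discrete HMM emits symbols from a finite alphabet, while a GMM-HMM emits real values drawn from a per-state Gaussian mixture. The sketch below scores a one-dimensional sequence with the scaled forward algorithm and assigns it to the model with the highest log-likelihood. It is an illustration under stated assumptions, not the paper's pipeline: the class `GMMHMM`, the function `classify`, the family names, and all parameter values are hypothetical, and the parameters are set by hand rather than trained (in practice they would be fit with Baum-Welch).

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GMMHMM:
    start: List[float]            # start[i] = P(first state is i)
    trans: List[List[float]]      # trans[i][j] = P(next state j | state i)
    weights: List[List[float]]    # weights[i][m] = mixture weight m in state i
    means: List[List[float]]      # means[i][m] = mean of component m in state i
    variances: List[List[float]]  # variances[i][m] = variance of component m

    def emission(self, state: int, x: float) -> float:
        """Gaussian mixture density of observation x in the given state."""
        total = 0.0
        for w, mu, var in zip(self.weights[state], self.means[state],
                              self.variances[state]):
            total += w * math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        return total

    def log_likelihood(self, obs: List[float]) -> float:
        """Scaled forward algorithm: returns log P(obs | model)."""
        if not obs:
            return 0.0
        n = len(self.start)
        alpha = [self.start[i] * self.emission(i, obs[0]) for i in range(n)]
        log_prob = 0.0
        for t in range(len(obs)):
            if t > 0:
                alpha = [
                    sum(alpha[i] * self.trans[i][j] for i in range(n))
                    * self.emission(j, obs[t])
                    for j in range(n)
                ]
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")
            log_prob += math.log(scale)          # accumulate log of scale factors
            alpha = [a / scale for a in alpha]   # renormalize to avoid underflow
        return log_prob


def classify(models: Dict[str, GMMHMM], obs: List[float]) -> str:
    """Pick the model name whose GMM-HMM gives obs the highest log-likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(obs))


# Hypothetical hand-set models for two made-up families (not trained, not from the paper).
models = {
    "family_low_entropy": GMMHMM(
        start=[0.5, 0.5], trans=[[0.9, 0.1], [0.1, 0.9]],
        weights=[[1.0], [1.0]], means=[[2.0], [3.5]], variances=[[0.5], [0.5]]),
    "family_high_entropy": GMMHMM(
        start=[0.5, 0.5], trans=[[0.9, 0.1], [0.1, 0.9]],
        weights=[[1.0], [1.0]], means=[[6.0], [7.5]], variances=[[0.5], [0.5]]),
}
label = classify(models, [7.1, 7.3, 6.9, 7.4])
```

Scoring a sequence against one model per family and taking the maximum is only one way to use HMM scores for classification; the abstract does not specify which scheme the authors use.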
