CR · AI · Nov 13, 2015

Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification

arXiv:1511.04317v2 · 370 citations
Originality Highly original
AI Analysis

This work addresses the challenge of malware categorization for the computer security community. It is incremental in that it builds on existing datasets and methods.

The paper tackles the problem of classifying malware variants into families by developing a novel paradigm for feature extraction, selection, and fusion, achieving an accuracy of approximately 0.998 on the Microsoft Malware Challenge dataset.

Modern malware is designed with mutation characteristics, namely polymorphism and metamorphism, which cause an enormous growth in the number of variants of malware samples. Categorization of malware samples on the basis of their behaviors is essential for the computer security community, because they receive a huge number of malware samples every day, and the signature extraction process is usually based on malicious parts characterizing malware families. Microsoft released a malware classification challenge in 2015 with a huge dataset of nearly 0.5 terabytes of data, containing more than 20K malware samples. The analysis of this dataset inspired the development of a novel paradigm that is effective in categorizing malware variants into their actual family groups. This paradigm is presented and discussed in the present paper, where emphasis has been given to the phases related to the extraction and selection of a set of novel features for the effective representation of malware samples. Features can be grouped according to different characteristics of malware behavior, and their fusion is performed according to a per-class weighting paradigm. The proposed method achieved a very high accuracy ($\approx$ 0.998) on the Microsoft Malware Challenge dataset.
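
The abstract does not spell out how the per-class weighting is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each feature group yields a probability per malware family and that each group carries one weight per family. The group names (`byte_features`, `asm_features`), the weights, and the probabilities are all hypothetical values chosen for illustration.

```python
from typing import Dict, List


def fuse_per_class(group_probs: Dict[str, List[float]],
                   class_weights: Dict[str, List[float]]) -> List[float]:
    """Fuse one sample's per-group class probabilities with per-class weights.

    For each class c, the fused score is the weighted average of the
    probabilities that every feature group assigns to c, using the
    weight that group holds for class c.
    """
    n_classes = len(next(iter(group_probs.values())))
    fused = []
    for c in range(n_classes):
        total_weight = sum(class_weights[g][c] for g in group_probs)
        score = sum(class_weights[g][c] * group_probs[g][c] for g in group_probs)
        fused.append(score / total_weight)
    return fused


def predict(group_probs: Dict[str, List[float]],
            class_weights: Dict[str, List[float]]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_per_class(group_probs, class_weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical example: two feature groups, three malware families.
weights = {
    "byte_features": [0.9, 0.5, 0.2],  # assumed per-class reliabilities
    "asm_features": [0.1, 0.5, 0.8],
}
sample = {
    "byte_features": [0.2, 0.5, 0.3],
    "asm_features": [0.1, 0.2, 0.7],
}
predicted_family = predict(sample, weights)
```

In this toy case the second group is trusted most for the third family, so its high probability there dominates the fused score. A single global weight per group could not express that kind of class-specific trust, which is presumably the motivation for weighting per class.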

Code Implementations · 19 repos

Data from Papers with Code (CC-BY-SA-4.0)

