CR · LG · Jun 21, 2016

Contextual Weisfeiler-Lehman Graph Kernel For Malware Detection

arXiv:1606.06369v1 · 27 citations
Originality: Incremental advance
AI Analysis

The paper addresses malware detection in cybersecurity and reports improved accuracy suited to large-scale real-world use, but the advance is incremental because it builds on existing graph kernels.

The paper tackles malware detection by proposing a graph kernel, CWLK, that captures both structural and contextual information in program graphs. On a dataset of more than 50,000 Android apps, it reports an F-measure more than 5.27% higher than two state-of-the-art graph kernels and more than 4.87% higher than three malware detection techniques.

In this paper, we propose a novel graph kernel specifically to address a challenging problem in the field of cyber-security, namely, malware detection. Previous research has revealed the following: (1) graph representations of programs are ideally suited for malware detection as they are robust against several attacks, and (2) besides capturing topological neighbourhoods (i.e., structural information) from these graphs, it is important to capture the context under which the neighbourhoods are reachable in order to accurately detect malicious neighbourhoods. We observe that state-of-the-art graph kernels, such as the Weisfeiler-Lehman kernel (WLK), capture the structural information well but fail to capture contextual information. To address this, we develop the Contextual Weisfeiler-Lehman kernel (CWLK), which is capable of capturing both types of information. We show that for the malware detection problem, CWLK is more expressive and hence more accurate than WLK while maintaining comparable efficiency. Through our large-scale experiments with more than 50,000 real-world Android apps, we demonstrate that CWLK outperforms two state-of-the-art graph kernels (including WLK) and three malware detection techniques by more than 5.27% and 4.87% F-measure, respectively, while maintaining high efficiency. This high accuracy and efficiency make CWLK suitable for large-scale real-world malware detection.
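To make the structure-versus-context distinction concrete, here is a minimal Python sketch, not the authors' implementation. It computes Weisfeiler-Lehman style neighbourhood features for a small labelled graph and, optionally, tags each feature with a per-node context string. The function names (`wl_features`, `kernel`), the string-based relabeling, the toy graph, and the choice to attach the context only to the counted feature are all assumptions made for illustration; the abstract does not specify these details.

```python
from collections import Counter

def wl_features(adj, labels, contexts=None, iterations=1):
    """Count WL neighbourhood labels; optionally tag each with a node context.

    adj:      dict node -> list of neighbour nodes (hypothetical toy format)
    labels:   dict node -> initial string label
    contexts: dict node -> context string, or None for plain WL features
    """
    current = dict(labels)
    feats = Counter()
    for h in range(iterations + 1):
        if h > 0:
            # WL relabeling: own label plus the sorted multiset of neighbour labels.
            current = {
                node: current[node] + "(" + ",".join(sorted(current[n] for n in adj[node])) + ")"
                for node in adj
            }
        for node, lab in current.items():
            ctx = contexts[node] if contexts is not None else ""
            feats[(h, lab, ctx)] += 1
    return feats

def kernel(f1, f2):
    """Linear kernel: dot product of two feature-count vectors."""
    return sum(count * f2[key] for key, count in f1.items())

# Two structurally identical graphs that differ only in one node's context.
adj = {0: [1], 1: [0]}
labels = {0: "a", 1: "b"}
ctx_g1 = {0: "user", 1: "user"}
ctx_g2 = {0: "user", 1: "none"}

plain = kernel(wl_features(adj, labels), wl_features(adj, labels))
contextual = kernel(wl_features(adj, labels, ctx_g1), wl_features(adj, labels, ctx_g2))
print(plain, contextual)  # 4 2
```

In this toy case the plain features cannot tell the two graphs apart (similarity 4, the maximum for these graphs), while the context-tagged features lower the similarity to 2 because the neighbourhoods rooted at node 1 occur under different contexts.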

