CR · AI · LG · Sep 19, 2022

Cross Project Software Vulnerability Detection via Domain Adaptation and Max-Margin Principle

arXiv:2209.10406v1 · 4 citations · h-index: 40 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses software vulnerability detection for security applications, presenting an incremental advance with domain adaptation and max-margin methods.

The paper tackles software vulnerability detection by addressing automatic representation learning and the scarcity of labeled data, improving F1-measure by 1.83% to 6.25% over the second-best baseline on the datasets used.

Software vulnerabilities (SVs) have become a common and serious concern due to the ubiquity of computer software. Many machine learning-based approaches have been proposed to solve the software vulnerability detection (SVD) problem. However, two significant issues remain open for SVD: i) learning automatic representations to improve predictive performance, and ii) tackling the scarcity of labeled vulnerability datasets, which conventionally require laborious labeling by experts. In this paper, we propose a novel end-to-end approach to tackle these two issues. We first exploit automatic representation learning with deep domain adaptation for software vulnerability detection. We then propose a novel cross-domain kernel classifier leveraging the max-margin principle to significantly improve the transfer of software vulnerability knowledge from labeled projects to unlabeled ones. Experimental results on real-world software datasets show the superiority of our proposed method over state-of-the-art baselines. In short, our method achieves an F1-measure, the most important measure in SVD, that is 1.83% to 6.25% higher than that of the second-best method on the datasets used. Our released source code samples are publicly available at https://github.com/vannguyennd/dam2p
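Since the abstract reports its headline result in F1-measure, a minimal sketch of how that metric is typically computed for binary vulnerability labels may help. This is not taken from the authors' repository: the function name `f1_measure` and the label convention (1 = vulnerable, 0 = non-vulnerable) are assumptions made for illustration.

```python
def f1_measure(y_true, y_pred):
    """F1 for the positive class, assuming 1 = vulnerable and 0 = non-vulnerable.

    Hypothetical helper for illustration; not part of the paper's released code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # vulnerable, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # safe, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # vulnerable, missed
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels for eight functions from an unlabeled target project.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
score = f1_measure(y_true, y_pred)
```

Because vulnerable functions are usually a small minority of a codebase, accuracy can look high even when most vulnerabilities are missed; F1 balances precision and recall on the vulnerable class, which is presumably why the abstract singles it out.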

Code Implementations: 1 repo
