CR SE · Jun 19, 2021

Vulnerability Detection with Fine-grained Interpretations

arXiv:2106.10478v1 · 338 citations

Originality: Incremental advance

AI Analysis

This paper addresses the need for more interpretable vulnerability detection in software security, offering incremental improvements over existing methods.

The paper tackles the limited interpretability of machine learning-based vulnerability detectors by introducing IVDetect, which provides fine-grained interpretations in terms of vulnerable statements. It reports improvements of 43%–84% in top-10 nDCG over existing DL-based approaches and correctly points out the relevant vulnerable statements in 67% of cases within a top-5 ranked list.
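The ranking metrics named above can be illustrated with a minimal sketch. It assumes binary relevance labels (1 for a truly vulnerable item, 0 otherwise) listed in the order a detector ranked them; the function names are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k ranked items."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (best possible) ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def average_precision(relevances):
    """Mean of precision values taken at each rank holding a relevant item."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """MAP: average precision averaged over several ranked lists."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


def hit_at_k(relevances, k=5):
    """True if at least one relevant item appears in the top k."""
    return any(relevances[:k])
```

Under this reading, a "top-5" accuracy is the fraction of cases for which `hit_at_k(..., k=5)` is true.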

Despite the successes of machine learning (ML) and deep learning (DL) based vulnerability detectors (VD), they are limited to providing only the decision on whether a given piece of code is vulnerable or not, without details on what part of the code is relevant to the detected vulnerability. We present IVDetect, an interpretable vulnerability detector with the philosophy of using Artificial Intelligence (AI) to detect vulnerabilities, while using Intelligence Assistant (IA) by providing VD interpretations in terms of vulnerable statements. For vulnerability detection, we separately consider the vulnerable statements and their surrounding contexts via data and control dependencies. This allows our model to better discriminate vulnerable statements than using the mixture of vulnerable code and contextual code, as existing approaches do. In addition to the coarse-grained vulnerability detection result, we leverage interpretable AI to provide users with fine-grained interpretations that include the sub-graph in the Program Dependency Graph (PDG) with the crucial statements that are relevant to the detected vulnerability. Our empirical evaluation on vulnerability databases shows that IVDetect outperforms the existing DL-based approaches by 43%–84% and 105%–255% in top-10 nDCG and MAP ranking scores, respectively. IVDetect correctly points out the vulnerable statements relevant to the vulnerability via its interpretation in 67% of the cases with a top-5 ranked list. It improves over baseline interpretation models by 12.3%–400% and 9%–400% in accuracy.
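As a rough illustration of treating a statement separately from its data- and control-dependency context, the sketch below groups each statement's direct PDG neighbors by edge kind. The edge-list representation, the `build_contexts` name, and the choice to use immediate neighbors in both directions are all assumptions for illustration; the abstract does not specify how IVDetect encodes these contexts.

```python
from collections import defaultdict


def build_contexts(edges):
    """Group each statement's PDG neighbors by dependency kind.

    edges: iterable of (src, dst, kind), where kind is "data" or "control"
    and src/dst are statement identifiers (hypothetical representation).
    Returns {statement: {"data": set, "control": set}}.
    """
    contexts = defaultdict(lambda: {"data": set(), "control": set()})
    for src, dst, kind in edges:
        if kind not in ("data", "control"):
            raise ValueError(f"unknown dependency kind: {kind!r}")
        # Record the dependency in both directions so that each statement
        # sees both what it depends on and what depends on it.
        contexts[src][kind].add(dst)
        contexts[dst][kind].add(src)
    return dict(contexts)


# Toy PDG: statement 3 uses a value defined at 1, is guarded by the
# branch at 2, and defines a value used at 4.
toy_edges = [(1, 3, "data"), (2, 3, "control"), (3, 4, "data")]
toy_contexts = build_contexts(toy_edges)
```

Here `toy_contexts[3]` keeps the data context `{1, 4}` apart from the control context `{2}`, so a model could embed the statement and each kind of context separately rather than as one mixed code fragment.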
