SEAI · Dec 11, 2024

EvalSVA: Multi-Agent Evaluators for Next-Gen Software Vulnerability Assessment

arXiv:2501.14737v1 · 6 citations · h-index: 8
Originality: Incremental advance
AI Analysis

The paper addresses software vulnerability assessment for developers and offers an incremental improvement through a multi-agent approach.

The paper tackles software vulnerability assessment, which is challenging because vulnerabilities are complex and labeled data is scarce. It introduces EvalSVA, a team of LLM-based evaluator agents, and reports that it outperforms previous methods by an average of 44.12% in accuracy and 43.29% in F1.

Software Vulnerability (SV) assessment is the crucial process of determining different aspects of SVs (e.g., attack vectors and scope) so that developers can effectively prioritize vulnerability mitigation efforts. It is a challenging and laborious process due to the complexity of SVs and the scarcity of labeled data. To mitigate these challenges, we introduce EvalSVA, a team of multi-agent evaluators that autonomously deliberates on and evaluates various aspects of SV assessment. Specifically, we propose a multi-agent framework that simulates real-world vulnerability assessment strategies by combining multiple Large Language Models (LLMs) into an integrated group, which enhances the effectiveness of SV assessment when data is limited. We also design diverse communication strategies that let the agents autonomously discuss and assess different aspects of SVs. Furthermore, we construct a multi-lingual SV assessment dataset based on the new CVSS standard, comprising 699, 888, and 1,310 vulnerability-related commits in C++, Python, and Java, respectively. Our experimental results demonstrate that EvalSVA outperforms previous methods by an average of 44.12% in accuracy and 43.29% in F1 for SV assessment. The results also show that EvalSVA follows a human-like process and generates both a reason and an answer for each assessment. EvalSVA can therefore aid human experts in SV assessment by providing more explanation and detail.
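The abstract describes several evaluator agents that discuss a vulnerability and settle on an answer with a reason. The following is a minimal sketch of that general idea, not the paper's implementation. The `keyword_agent` helper is a hypothetical stand-in for an LLM call, the sequential "each agent sees everything said so far" exchange is only one assumed communication strategy, and the final majority vote is an assumed aggregation rule. The label set is the Attack Vector metric from CVSS v3.x.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Allowed values of the CVSS v3.x "Attack Vector" base metric.
ATTACK_VECTOR = ("NETWORK", "ADJACENT", "LOCAL", "PHYSICAL")


@dataclass
class Opinion:
    agent: str   # name of the agent that spoke
    label: str   # proposed Attack Vector value
    reason: str  # short justification, so the result is explainable


# An agent maps (commit text, transcript so far) to an Opinion.
Agent = Callable[[str, List[Opinion]], Opinion]


def keyword_agent(name: str, keyword: str, label: str, fallback: str) -> Agent:
    """Hypothetical stand-in for an LLM-backed evaluator (assumption)."""
    def agent(commit: str, transcript: List[Opinion]) -> Opinion:
        if keyword in commit.lower():
            return Opinion(name, label, f"commit mentions '{keyword}'")
        if transcript:
            # No evidence of its own: side with the most common earlier label.
            top = Counter(o.label for o in transcript).most_common(1)[0][0]
            return Opinion(name, top, "deferring to earlier opinions")
        return Opinion(name, fallback, "no evidence; default guess")
    return agent


def deliberate(commit: str, agents: List[Agent], rounds: int = 2) -> Tuple[str, List[Opinion]]:
    """Run sequential discussion rounds, then majority-vote the final opinions."""
    transcript: List[Opinion] = []
    for _ in range(rounds):
        for agent in agents:
            opinion = agent(commit, transcript)
            if opinion.label not in ATTACK_VECTOR:
                raise ValueError(f"invalid label: {opinion.label}")
            transcript.append(opinion)
    # Keep only each agent's latest opinion before voting.
    latest = {o.agent: o.label for o in transcript}
    winner = Counter(latest.values()).most_common(1)[0][0]
    return winner, transcript


agents = [
    keyword_agent("network_expert", "http", "NETWORK", "LOCAL"),
    keyword_agent("local_expert", "file", "LOCAL", "LOCAL"),
    keyword_agent("hardware_expert", "usb", "PHYSICAL", "LOCAL"),
]
label, transcript = deliberate("Fix buffer overflow when parsing HTTP headers", agents)
print(label)
for opinion in transcript:
    print(f"{opinion.agent}: {opinion.label} ({opinion.reason})")
```

In a real system each agent would call an LLM with the commit and the transcript as context. The transcript doubles as the explanation trail, which matches the abstract's point that the system produces a reason alongside each answer.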

