CRAIOct 20, 2025

ParaVul: A Parallel Large Language Model and Retrieval-Augmented Framework for Smart Contract Vulnerability Detection

arXiv:2510.17919v1h-index: 19
Originality Incremental advance
AI Analysis

This addresses security threats in blockchain by improving detection accuracy, though it is incremental as it builds on existing LLM and RAG methods.

The paper tackles smart contract vulnerability detection by proposing ParaVul, a parallel LLM and retrieval-augmented framework, achieving F1 scores of 0.9398 for single-label and 0.9330 for multi-label detection.

Smart contracts play a significant role in automating blockchain services. Nevertheless, vulnerabilities in smart contracts pose serious threats to blockchain security. Currently, traditional detection methods primarily rely on static analysis and formal verification, which can result in high false-positive rates and poor scalability. Large Language Models (LLMs) have recently made significant progress in smart contract vulnerability detection. However, they still face challenges such as high inference costs and substantial computational overhead. In this paper, we propose ParaVul, a parallel LLM and retrieval-augmented framework to improve the reliability and accuracy of smart contract vulnerability detection. Specifically, we first develop Sparse Low-Rank Adaptation (SLoRA) for LLM fine-tuning. SLoRA introduces sparsification by incorporating a sparse matrix into quantized LoRA-based LLMs, thereby reducing computational overhead and resource requirements while enhancing their ability to understand vulnerability-related issues. We then construct a vulnerability contract dataset and develop a hybrid Retrieval-Augmented Generation (RAG) system that integrates dense retrieval with Best Matching 25 (BM25), assisting in verifying the results generated by the LLM. Furthermore, we propose a meta-learning model to fuse the outputs of the RAG system and the LLM, thereby generating the final detection results. After completing vulnerability detection, we design chain-of-thought prompts to guide LLMs to generate comprehensive vulnerability detection reports. Simulation results demonstrate the superiority of ParaVul, especially in terms of F1 scores, achieving 0.9398 for single-label detection and 0.9330 for multi-label detection.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes