CR, AI, CL, LG · Apr 13, 2023

Automated Mapping of CVE Vulnerability Records to MITRE CWE Weaknesses

arXiv:2304.11130v1 · 10 citations · h-index: 47
Originality: Incremental advance
AI Analysis

This work addresses the time-consuming manual vulnerability mapping that cybersecurity professionals perform. It is an incremental advance: it applies existing machine learning techniques to a new dataset.

The authors tackle the problem of automatically mapping CVE vulnerability records to MITRE CWE weaknesses. They create a manually annotated dataset of 4,012 records and fine-tune two deep learning models, Sentence-BERT and rankT5, which show sizable performance gains over the BM25, BERT, and RoBERTa baselines.

In recent years, cyber-security threats have grown in both number and diversity, leading to an increase in their reporting and analysis. In response, many non-profit organizations have emerged in this domain, such as MITRE and OWASP, which actively track vulnerabilities and publish defense recommendations in standardized formats. As producing data in such formats manually is very time-consuming, there have been some proposals to automate the process. Unfortunately, a major obstacle to adopting supervised machine learning for this problem has been the lack of publicly available specialized datasets. Here, we aim to bridge this gap. In particular, we focus on mapping CVE records into MITRE CWE Weaknesses, and we release to the research community a manually annotated dataset of 4,012 records for this task. With a human-in-the-loop framework in mind, we approach the problem as a ranking task and aim to incorporate reinforcement learning to make use of the human feedback in future work. Our experimental results using fine-tuned deep learning models, namely Sentence-BERT and rankT5, show sizable performance gains over BM25, BERT, and RoBERTa, which demonstrates the need for an architecture capable of good semantic understanding for this task.
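
To make the ranking formulation concrete, here is a minimal sketch of a lexical baseline in the spirit of BM25: given a CVE description, it scores every candidate CWE and returns them best first. It is not the authors' implementation. The function names (`tokenize`, `bm25_rank`), the parameter defaults, and the example CVE sentence are assumptions made for illustration. The CWE IDs are real, but their descriptions here are short paraphrases written for this example, not the official MITRE text.

```python
import math
import re
from collections import Counter

# Toy candidate set. The IDs are real CWE entries; the descriptions are
# paraphrases invented for this sketch, not official MITRE wording.
CWE_DOCS = {
    "CWE-79": "cross-site scripting improper neutralization of user input "
              "during web page generation script injected into html",
    "CWE-89": "sql injection improper neutralization of special elements "
              "used in an sql command database query",
    "CWE-120": "buffer overflow copy without checking size of input "
               "memory buffer overrun",
    "CWE-22": "path traversal improper limitation of a pathname to a "
              "restricted directory file access",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (doc_id, score) pairs sorted by descending BM25 score."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: in how many candidates each term appears.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that is always non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical CVE-style description, written for this example.
cve_text = ("The login form allows remote attackers to execute arbitrary SQL "
            "commands via the username parameter in a database query.")
ranking = bm25_rank(cve_text, CWE_DOCS)
for cwe_id, score in ranking:
    print(f"{cwe_id}: {score:.3f}")
```

A purely lexical scorer like this only rewards exact token overlap, so a CVE that describes a weakness in different words from the CWE text would rank poorly. That limitation is consistent with the abstract's argument that the task needs models with better semantic understanding.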
