OutCenTR: A novel semi-supervised framework for predicting exploits of vulnerabilities in high-dimensional datasets
This work addresses a critical task for system administrators: prioritizing which vulnerabilities to patch. The contribution appears incremental, however, because it enhances existing outlier detection methods.
The paper tackles the problem of predicting which vulnerabilities are likely to be exploited in high-dimensional datasets such as the National Vulnerability Database. It proposes a semi-supervised framework, OutCenTR, that enhances baseline outlier detection models and yields, on average, a 5-fold improvement in F1 score over state-of-the-art dimensionality reduction techniques such as PCA and GRP.
An ever-growing number of vulnerabilities is reported every day. Yet these vulnerabilities are not all the same; some are more targeted than others. Correctly estimating the likelihood that a vulnerability will be exploited is a critical task for system administrators, because it helps them prioritize and patch the right vulnerabilities. Our work uses outlier detection techniques to predict which vulnerabilities are likely to be exploited in highly imbalanced, high-dimensional datasets such as the National Vulnerability Database. We propose a dimensionality reduction technique, OutCenTR, that enhances the baseline outlier detection models. We further demonstrate the effectiveness and efficiency of OutCenTR empirically on 4 benchmark and 12 synthetic datasets. The results of our experiments show, on average, a 5-fold improvement in F1 score compared with state-of-the-art dimensionality reduction techniques such as PCA and GRP.
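To make the evaluation setting concrete, the following is a minimal sketch of the kind of baseline pipeline the abstract compares against: reduce dimensionality, score outliers, and measure F1 on an imbalanced dataset. It is not OutCenTR itself. It assumes that GRP denotes Gaussian random projection, and it swaps in a simple distance-from-centroid score as a stand-in outlier detector. All function names, the synthetic data, and the parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_random_projection(rows, k, seed=0):
    """Project d-dimensional rows onto k dimensions with a random Gaussian matrix.

    Assumption: this is what "GRP" refers to; entries are drawn from N(0, 1/k).
    """
    rng = random.Random(seed)
    d = len(rows[0])
    matrix = [[rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(k)]
              for _ in range(d)]
    return [[sum(x[i] * matrix[i][j] for i in range(d)) for j in range(k)]
            for x in rows]


def centroid_distance_scores(rows):
    """Stand-in outlier score: Euclidean distance of each row from the centroid."""
    n, k = len(rows), len(rows[0])
    centroid = [sum(r[j] for r in rows) / n for j in range(k)]
    return [math.sqrt(sum((r[j] - centroid[j]) ** 2 for j in range(k)))
            for r in rows]


def flag_top_fraction(scores, fraction):
    """Label the highest-scoring fraction of points as outliers (1), the rest 0."""
    n_flag = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flag])
    return [1 if i in flagged else 0 for i in range(len(scores))]


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive (outlier) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def demo(n_inliers=195, n_outliers=5, dim=50, k=5, seed=42):
    """Run the sketch on hypothetical synthetic data: few outliers, many dimensions."""
    rng = random.Random(seed)
    inliers = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_inliers)]
    # Hypothetical outliers: the same distribution shifted in every dimension.
    outliers = [[rng.gauss(6.0, 1.0) for _ in range(dim)] for _ in range(n_outliers)]
    data = inliers + outliers
    labels = [0] * n_inliers + [1] * n_outliers

    reduced = gaussian_random_projection(data, k, seed=seed)
    scores = centroid_distance_scores(reduced)
    predictions = flag_top_fraction(scores, n_outliers / len(data))
    return f1_score(labels, predictions)


if __name__ == "__main__":
    print(f"F1 on the toy dataset: {demo():.3f}")
```

F1 is the relevant metric here because the positive class is rare: a detector that flags nothing on this toy dataset would be 97.5% accurate while scoring an F1 of 0.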