Farnaz Soltaniani

CR
h-index10
3papers
6citations
Novelty27%
AI Score31

3 Papers

CRJan 30
From Data Leak to Secret Misses: The Impact of Data Leakage on Secret Detection Models

Farnaz Soltaniani, Mohammad Ghafari

Machine learning models are increasingly used for software security tasks. These models are commonly trained and evaluated on large Internet-derived datasets, which often contain duplicated or highly similar samples. When such samples are split across training and test sets, data leakage may occur, allowing models to memorize patterns instead of learning to generalize. We investigate duplication in a widely used benchmark dataset of hard coded secrets and show how data leakage can substantially inflate the reported performance of AI-based secret detectors, resulting in a misleading picture of their real-world effectiveness.

CRJan 30
Evaluating Large Language Models for Security Bug Report Prediction

Farnaz Soltaniani, Shoaib Razzaq, Mohammad Ghafari

Early detection of security bug reports (SBRs) is critical for timely vulnerability mitigation. We present an evaluation of prompt-based engineering and fine-tuning approaches for predicting SBRs using Large Language Models (LLMs). Our findings reveal a distinct trade-off between the two approaches. Prompted proprietary models demonstrate the highest sensitivity to SBRs, achieving a G-measure of 77% and a recall of 74% on average across all the datasets, albeit at the cost of a higher false-positive rate, resulting in an average precision of only 22%. Fine-tuned models, by contrast, exhibit the opposite behavior, attaining a lower overall G-measure of 51% but substantially higher precision of 75% at the cost of reduced recall of 36%. Though a one-time investment in building fine-tuned models is necessary, the inference on the largest dataset is up to 50 times faster than that of proprietary models. These findings suggest that further investigations to harness the power of LLMs for SBR prediction are necessary.

CRApr 28, 2025
Security Bug Report Prediction Within and Across Projects: A Comparative Study of BERT and Random Forest

Farnaz Soltaniani, Mohammad Ghafari, Mohammed Sayagh

Early detection of security bug reports (SBRs) is crucial for preventing vulnerabilities and ensuring system reliability. While machine learning models have been developed for SBR prediction, their predictive performance still has room for improvement. In this study, we conduct a comprehensive comparison between BERT and Random Forest (RF), a competitive baseline for predicting SBRs. The results show that RF outperforms BERT with a 34% higher average G-measure for within-project predictions. Adding only SBRs from various projects improves both models' average performance. However, including both security and nonsecurity bug reports significantly reduces RF's average performance to 46%, while boosts BERT to its best average performance of 66%, surpassing RF. In cross-project SBR prediction, BERT achieves a remarkable 62% G-measure, which is substantially higher than RF.