Graphing Website Relationships for Risk Prediction: Identifying Derived Threats to Users Based on Known Indicators
This work addresses cybersecurity risk prediction for web users by leveraging graph-based relationships, but it is incremental as it builds on known threat indicators and existing methods.
The study tackled the problem of predicting website risk by analyzing referrer link relationships and hop distances to malicious sites, achieving true positive rates of 58.59% to 63.45% and false positive rates of 7.42% to 37.50%.
The hypothesis for the study was that the relationship between websites, based on referrer links and the number of hops to a malicious site, could indicate the risk to another website. We chose Receiver Operating Characteristic (ROC) analysis to compare true positive and false positive rates for captured web traffic and so test the predictive capabilities of our model. Known threat indicators were used as designators, and the Neo4j graph database was leveraged to map the relationships between other websites based on referring links. Using the referring traffic, we mapped user visits across websites with a known relationship to track the rate at which users progressed from a non-malicious website to a known threat. The results were grouped by hop distance from the known threat to calculate the predictive rate. The model produced true positive rates between 58.59% and 63.45% and false positive rates between 7.42% and 37.50%. These rates suggest that performance improves with closer proximity to the known threat, while a greater referring distance from the threat results in higher false positive rates.
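The hop-distance grouping described above can be illustrated with a minimal Python sketch. It is not the study's implementation: an in-memory breadth-first search stands in for the Neo4j queries, and the function names (`hop_distances`, `rates_within_hops`), the site names, and the rule that a site is flagged as risky when it lies within a chosen number of referrer hops of a known threat are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def hop_distances(referrer_edges, known_threats):
    """Return the referrer-hop distance from each site to its nearest known threat.

    referrer_edges: iterable of (source, destination) pairs, meaning that
    `source` referred a user to `destination` (hypothetical input format).
    """
    # Index the sites that refer *to* each destination so we can walk backwards.
    predecessors = defaultdict(set)
    for source, destination in referrer_edges:
        predecessors[destination].add(source)

    # Multi-source BFS starting from every known threat (distance 0).
    distance = {threat: 0 for threat in known_threats}
    queue = deque(known_threats)
    while queue:
        site = queue.popleft()
        for referrer in predecessors[site]:
            if referrer not in distance:
                distance[referrer] = distance[site] + 1
                queue.append(referrer)
    return distance


def rates_within_hops(distance, reached_threat, max_hops):
    """Compute (TPR, FPR) when sites within `max_hops` of a threat are flagged.

    reached_threat: dict mapping site -> bool, True if observed users of that
    site went on to reach a known threat (assumed ground-truth labelling).
    """
    tp = fp = fn = tn = 0
    for site, actual in reached_threat.items():
        hops = distance.get(site)  # None when no referrer path to a threat exists
        flagged = hops is not None and 1 <= hops <= max_hops
        if flagged and actual:
            tp += 1
        elif flagged:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy data: "bad" is the only known threat; all site names are made up.
edges = [("a", "b"), ("b", "bad"), ("c", "bad"), ("d", "e"), ("e", "a"), ("f", "g")]
labels = {"a": True, "b": True, "c": False, "d": False,
          "e": False, "f": False, "g": True}

dist = hop_distances(edges, {"bad"})
for k in (1, 2, 3):
    print(k, rates_within_hops(dist, labels, k))
```

Sweeping `max_hops` yields one (false positive rate, true positive rate) point per threshold, which is the kind of pairing an ROC analysis compares. The toy labels carry no empirical meaning and are not expected to reproduce the reported rates.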