Controlled Data Sharing for Collaborative Predictive Blacklisting
This paper addresses the challenge of sharing cybersecurity data among organizations, which is often hindered by confidentiality and trust issues, and offers a practical solution with incremental improvements in threat prediction.
The paper proposes a controlled data sharing approach to collaborative cybersecurity threat mitigation: organizations estimate the benefits of collaboration and share data privately using cryptographic tools, achieving up to 105% accuracy improvement on average in predictive blacklisting.
Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from DShield over two months. We find that controlled data sharing yields up to 105% accuracy improvement on average, while also reducing the false positive rate.
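The decision flow described above (estimate the benefit of collaborating, then agree on what to share) can be illustrated with a minimal plaintext sketch. It makes several assumptions that are not stated in the abstract: Jaccard similarity of the two organizations' attacker sets stands in for the benefit estimate, the set intersection stands in for the sharing strategy, and the 0.2 threshold is arbitrary. All function names and IP addresses are hypothetical. The sketch contains no cryptography; in the approach described, the estimate would be computed with privacy-preserving tools so that neither party discloses its dataset.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def decide_and_share(own: Set[str], other: Set[str], threshold: float = 0.2) -> Set[str]:
    """Return the entries to share, or an empty set if the estimated benefit is too low.

    Assumption: similarity is the benefit estimate and the intersection is
    what gets shared. Both are illustrative choices, not the paper's stated method.
    """
    if jaccard(own, other) < threshold:
        return set()
    return own & other


def relative_improvement(baseline_hits: int, collaborative_hits: int) -> float:
    """Percentage improvement of collaborative over baseline correct predictions."""
    return 100.0 * (collaborative_hits - baseline_hits) / baseline_hits


# Hypothetical attacker sets (documentation-range IP addresses).
org_a = {"198.51.100.7", "203.0.113.5", "192.0.2.44"}
org_b = {"203.0.113.5", "192.0.2.44", "198.51.100.99"}

similarity = jaccard(org_a, org_b)          # 2 shared / 4 distinct = 0.5
shared = decide_and_share(org_a, org_b)     # the two addresses both have seen
gain = relative_improvement(100, 205)       # 105.0
```

For scale, a 105% improvement means slightly more than doubling the number of correct predictions, e.g. going from 100 correctly forecast attack sources to 205.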