CRJun 3
TeleHunt: A Framework and Tool for Efficient Cybercriminal Community Discovery on TelegramRoy Ricaldi, Victor Asanache, Luca Allodi
This paper presents TeleHunt, a framework and tool for evaluating the effectiveness of different strategies to discover cybercriminal communities on Telegram. TeleHunt employs a set of reference-driven snowballing strategies, integrating message-level classification, contextual filtering, and market-segment labeling. Using open- and dark-web seeds, we systematically evaluate how seed source, pointer type, and exploration strategy influence discovery outcomes in three dimensions: efficiency, accessibility, and rediscovery. Our work provides (i) a modular cybercrime content discovery pipeline, (ii) the first systematic comparison of Telegram discovery strategies with an empirical characterization of market-segment accessibility, and (iii) a labeled dataset of over 172 million messages from 6,022 Telegram communities.
CRMay 14
Topical Shifts in the Dark Web: A Longitudinal Analysis of Content from the Cybercrime EcosystemRoy Ricaldi, Maximilian Schafer, Philipp Zech et al.
The dark web hosts a dynamic ecosystem of cybercrime forums and marketplaces that adapt to law enforcement pressure, technological change, and economic incentives. Prior research has extracted cyber threat intelligence from these platforms using static snapshots, with limited attention to how discussions evolve over time. In this study, we conduct a longitudinal analysis of 25,065 websites in the dark web using 11,403,638 HTML snapshots (approximately 1245.38 GB) collected over six years. We develop a longitudinal topic-modeling framework combining domain-specific embeddings, density-based clustering and temporal aggregation to measure topic prevalence and lifecycle at the website level. Our analysis identifies 55 thematic clusters. We find that approximately 75% of total discussion volume is concentrated in a small set of persistent core topics, while short-lived themes account for approximately 3% of activity. The median topic lifespan is 75 months, indicating gradual thematic evolution rather than abrupt replacement.
CROct 16, 2020Code
SAIBERSOC: Synthetic Attack Injection to Benchmark and Evaluate the Performance of Security Operation CentersMartin Rosso, Michele Campobasso, Ganduulga Gankhuyag et al.
In this paper we introduce SAIBERSOC, a tool and methodology enabling security researchers and operators to evaluate the performance of deployed and operational Security Operation Centers (SOCs) (or any other security monitoring infrastructure). The methodology relies on the MITRE ATT&CK Framework to define a procedure to generate and automatically inject synthetic attacks in an operational SOC to evaluate any output metric of interest (e.g., detection accuracy, time-to-investigation, etc.). To evaluate the effectiveness of the proposed methodology, we devise an experiment with $n=124$ students playing the role of SOC analysts. The experiment relies on a real SOC infrastructure and assigns students to either a BADSOC or a GOODSOC experimental condition. Our results show that the proposed methodology is effective in identifying variations in SOC performance caused by (minimal) changes in SOC configuration. We release the SAIBERSOC tool implementation as free and open source software.
CRJul 2, 2025
On the Effect of Ruleset Tuning and Data Imbalance on Explainable Network Security Alert Classifications: a Case-Study on DeepCASEKoen T. W. Teuwen, Sam Baggen, Emmanuele Zambon et al.
Automation in Security Operations Centers (SOCs) plays a prominent role in alert classification and incident escalation. However, automated methods must be robust in the presence of imbalanced input data, which can negatively affect performance. Additionally, automated methods should make explainable decisions. In this work, we evaluate the effect of label imbalance on the classification of network intrusion alerts. As our use-case we employ DeepCASE, the state-of-the-art method for automated alert classification. We show that label imbalance impacts both classification performance and correctness of the classification explanations offered by DeepCASE. We conclude tuning the detection rules used in SOCs can significantly reduce imbalance and may benefit the performance and explainability offered by alert post-processing methods such as DeepCASE. Therefore, our findings suggest that traditional methods to improve the quality of input data can benefit automation.
CRSep 17, 2020
CARONTE: Crawling Adversarial Resources Over Non-Trusted, High-Profile EnvironmentsMichele Campobasso, Pavlo Burda, Luca Allodi
The monitoring of underground criminal activities is often automated to maximize the data collection and to train ML models to automatically adapt data collection tools to different communities. On the other hand, sophisticated adversaries may adopt crawling-detection capabilities that may significantly jeopardize researchers' opportunities to perform the data collection, for example by putting their accounts under the spotlight and being expelled from the community. This is particularly undesirable in prominent and high-profile criminal communities where entry costs are significant (either monetarily or for example for background checking or other trust-building mechanisms). This paper presents CARONTE, a tool to semi-automatically learn virtually any forum structure for parsing and data-extraction, while maintaining a low profile for the data collection and avoiding the requirement of collecting massive datasets to maintain tool scalability. We showcase the tool against four underground forums, and compare the network traffic it generates (as seen from the adversary's position, i.e. the underground community's server) against state-of-the-art tools for web-crawling as well as human users.
CRSep 9, 2020
Impersonation-as-a-Service: Characterizing the Emerging Criminal Infrastructure for User Impersonation at ScaleMichele Campobasso, Luca Allodi
In this paper we provide evidence of an emerging criminal infrastructure enabling impersonation attacks at scale. Impersonation-as-a-Service (ImpaaS) allows attackers to systematically collect and enforce user profiles (consisting of user credentials, cookies, device and behavioural fingerprints, and other metadata) to circumvent risk-based authentication system and effectively bypass multi-factor authentication mechanisms. We present the ImpaaS model and evaluate its implementation by analysing the operation of a large, invite-only, Russian ImpaaS platform providing user profiles for more than $260'000$ Internet users worldwide. Our findings suggest that the ImpaaS model is growing, and provides the mechanisms needed to systematically evade authentication controls across multiple platforms, while providing attackers with a reliable, up-to-date, and semi-automated environment enabling target selection and user impersonation against Internet users as scale.
CRMay 6, 2019
Cognitive Triaging of Phishing AttacksAmber van der Heijden, Luca Allodi
In this paper we employ quantitative measurements of cognitive vulnerability triggers in phishing emails to predict the degree of success of an attack. To achieve this we rely on the cognitive psychology literature and develop an automated and fully quantitative method based on machine learning and econometrics to construct a triaging mechanism built around the cognitive features of a phishing email; we showcase our approach relying on data from the anti-phishing division of a large financial organization in Europe. Our evaluation shows empirically that an effective triaging mechanism for phishing success can be put in place by response teams to effectively prioritize remediation efforts (e.g. domain takedowns), by first acting on those attacks that are more likely to collect high response rates from potential victims.
CYAug 20, 2018
The Effect of Security Education and Expertise on Security Assessments: the Case of Software VulnerabilitiesLuca Allodi, Marco Cremonini, Fabio Massacci et al.
In spite of the growing importance of software security and the industry demand for more cyber security expertise in the workforce, the effect of security education and experience on the ability to assess complex software security problems has only been recently investigated. As proxy for the full range of software security skills, we considered the problem of assessing the severity of software vulnerabilities by means of a structured analysis methodology widely used in industry (i.e. the Common Vulnerability Scoring System (\CVSS) v3), and designed a study to compare how accurately individuals with background in information technology but different professional experience and education in cyber security are able to assess the severity of software vulnerabilities. Our results provide some structural insights into the complex relationship between education or experience of assessors and the quality of their assessments. In particular we find that individual characteristics matter more than professional experience or formal education; apparently it is the \emph{combination} of skills that one owns (including the actual knowledge of the system under study), rather than the specialization or the years of experience, to influence more the assessment quality. Similarly, we find that the overall advantage given by professional expertise significantly depends on the composition of the individual security skills as well as on the available information.
CRMay 24, 2018
A Bug Bounty Perspective on the Disclosure of Web VulnerabilitiesJukka Ruohonen, Luca Allodi
Bug bounties have become increasingly popular in recent years. This paper discusses bug bounties by framing these theoretically against so-called platform economy. Empirically the interest is on the disclosure of web vulnerabilities through the Open Bug Bounty (OBB) platform between 2015 and late 2017. According to the empirical results based on a dataset covering nearly 160 thousand web vulnerabilities, (i) OBB has been successful as a community-based platform for the dissemination of web vulnerabilities. The platform has also attracted many productive hackers, (ii) but there exists a large productivity gap, which likely relates to (iii) a knowledge gap and the use of automated tools for web vulnerability discovery. While the platform (iv) has been exceptionally fast to evaluate new vulnerability submissions, (v) the patching times of the web vulnerabilities disseminated have been long. With these empirical results and the accompanying theoretical discussion, the paper contributes to the small but rapidly growing amount of research on bug bounties. In addition, the paper makes a practical contribution by discussing the business models behind bug bounties from the viewpoints of platforms, ecosystems, and vulnerability markets.
CRMar 20, 2018
Identifying Relevant Information Cues for Vulnerability Assessment Using CVSSLuca Allodi, Sebastian Banescu, Henning Femmer et al.
The assessment of new vulnerabilities is an activity that accounts for information from several data sources and produces a `severity' score for the vulnerability. The Common Vulnerability Scoring System (\CVSS) is the reference standard for this assessment. Yet, no guidance currently exists on \emph{which information} aids a correct assessment and should therefore be considered. In this paper we address this problem by evaluating which information cues increase (or decrease) assessment accuracy. We devise a block design experiment with 67 software engineering students with varying vulnerability information and measure scoring accuracy under different information sets. We find that baseline vulnerability descriptions provided by standard vulnerability sources provide only part of the information needed to achieve an accurate vulnerability assessment. Further, we find that additional information on \texttt{assets}, \texttt{attacks}, and \texttt{vulnerability type} contributes in increasing the accuracy of the assessment; conversely, information on \texttt{known threats} misleads the assessor and decreases assessment accuracy and should be avoided when assessing vulnerabilities. These results go in the direction of formalizing the vulnerability communication to, for example, fully automate security assessments.
CRJan 15, 2018
Attack Potential in Impact and ComplexityLuca Allodi, Fabio Massacci
Vulnerability exploitation is reportedly one of the main attack vectors against computer systems. Yet, most vulnerabilities remain unexploited by attackers. It is therefore of central importance to identify vulnerabilities that carry a high `potential for attack'. In this paper we rely on Symantec data on real attacks detected in the wild to identify a trade-off in the Impact and Complexity of a vulnerability, in terms of attacks that it generates; exploiting this effect, we devise a readily computable estimator of the vulnerability's Attack Potential that reliably estimates the expected volume of attacks against the vulnerability. We evaluate our estimator performance against standard patching policies by measuring foiled attacks and demanded workload expressed as the number of vulnerabilities entailed to patch. We show that our estimator significantly improves over standard patching policies by ruling out low-risk vulnerabilities, while maintaining invariant levels of coverage against attacks in the wild. Our estimator can be used as a first aid for vulnerability prioritisation to focus assessment efforts on high-potential vulnerabilities.
CRJan 14, 2018
Towards Realistic Threat Modeling: Attack Commodification, Irrelevant Vulnerabilities, and Unrealistic AssumptionsLuca Allodi, Sandro Etalle
Current threat models typically consider all possible ways an attacker can penetrate a system and assign probabilities to each path according to some metric (e.g. time-to-compromise). In this paper we discuss how this view hinders the realness of both technical (e.g. attack graphs) and strategic (e.g. game theory) approaches of current threat modeling, and propose to steer away by looking more carefully at attack characteristics and attacker environment. We use a toy threat model for ICS attacks to show how a realistic view of attack instances can emerge from a simple analysis of attack phases and attacker limitations.
CRAug 16, 2017
Economic Factors of Vulnerability Trade and ExploitationLuca Allodi
Cybercrime markets support the development and diffusion of new attack technologies, vulnerability exploits, and malware. Whereas the revenue streams of cyber attackers have been studied multiple times in the literature, no quantitative account currently exists on the economics of attack acquisition and deployment. Yet, this understanding is critical to characterize the production of (traded) exploits, the economy that drives it, and its effects on the overall attack scenario. In this paper we provide an empirical investigation of the economics of vulnerability exploitation, and the effects of market factors on likelihood of exploit. Our data is collected first-handedly from a prominent Russian cybercrime market where the trading of the most active attack tools reported by the security industry happens. Our findings reveal that exploits in the underground are priced similarly or above vulnerabilities in legitimate bug-hunting programs, and that the refresh cycle of exploits is slower than currently often assumed. On the other hand, cybercriminals are becoming faster at introducing selected vulnerabilities, and the market is in clear expansion both in terms of players, traded exploits, and exploit pricing. We then evaluate the effects of these market variables on likelihood of attack realization, and find strong evidence of the correlation between market activity and exploit deployment. We discuss implications on vulnerability metrics, economics, and exploit measurement.
CRJan 7, 2013
My Software has a Vulnerability, should I worry?Luca Allodi, Fabio Massacci
(U.S) Rule-based policies to mitigate software risk suggest to use the CVSS score to measure the individual vulnerability risk and act accordingly: an HIGH CVSS score according to the NVD (National (U.S.) Vulnerability Database) is therefore translated into a "Yes". A key issue is whether such rule is economically sensible, in particular if reported vulnerabilities have been actually exploited in the wild, and whether the risk score do actually match the risk of actual exploitation. We compare the NVD dataset with two additional datasets, the EDB for the white market of vulnerabilities (such as those present in Metasploit), and the EKITS for the exploits traded in the black market. We benchmark them against Symantec's threat explorer dataset (SYM) of actual exploit in the wild. We analyze the whole spectrum of CVSS submetrics and use these characteristics to perform a case-controlled analysis of CVSS scores (similar to those used to link lung cancer and smoking) to test its reliability as a risk factor for actual exploitation. We conclude that (a) fixing just because a high CVSS score in NVD only yields negligible risk reduction, (b) the additional existence of proof of concepts exploits (e.g. in EDB) may yield some additional but not large risk reduction, (c) fixing in response to presence in black markets yields the equivalent risk reduction of wearing safety belt in cars (you might also die but still..). On the negative side, our study shows that as industry we miss a metric with high specificity (ruling out vulns for which we shouldn't worry). In order to address the feedback from BlackHat 2013's audience, the final revision (V3) provides additional data in Appendix A detailing how the control variables in the study affect the results.