Damon McCoy

CR
h-index: 5
8 papers
1,313 citations
Novelty: 46%
AI Score: 36

8 Papers

CR Jun 20, 2025
Tracker Installations Are Not Created Equal: Understanding Tracker Configuration of Form Data Collection

Julia B. Kieserman, Athanasios Andreou, Chris Geeng et al.

Targeted advertising is fueled by the comprehensive tracking of users' online activity. As a result, advertising companies, such as Google and Meta, encourage website administrators to not only install tracking scripts on their websites but configure them to automatically collect users' Personally Identifying Information (PII). In this study, we aim to characterize how Google and Meta's trackers can be configured to collect PII from web forms. We first perform a qualitative analysis of how third parties present form data collection to website administrators in the documentation and user interface. We then perform a measurement study of 40,150 websites to quantify the prevalence and configuration of Google and Meta trackers. Our results reveal that both Meta and Google encourage the use of form data collection and include inaccurate statements about hashing PII as a privacy-preserving method. Additionally, we find that Meta includes configuring form data collection as part of the basic setup flow. Our large-scale measurement study reveals that while Google trackers are more prevalent than Meta trackers (72.6% vs. 28.2% of websites), Meta trackers are configured to collect form data more frequently (62.3% for Meta vs. 11.6% for Google). Finally, we identify sensitive finance and health websites that have installed trackers that are likely configured to collect PII from form data in violation of Meta and Google policies. Our study highlights how tracker documentation and interfaces can potentially play a role in users' privacy through the configuration choices made by the website administrators who install trackers.

CL Jan 31, 2024
Global-Liar: Factuality of LLMs over Time and Geographic Regions

Shujaat Mirza, Bruno Coelho, Yuyuan Cui et al.

The increasing reliance on AI-driven solutions, particularly Large Language Models (LLMs) like the GPT series, for information retrieval highlights the critical need for their factuality and fairness, especially amidst the rampant spread of misinformation and disinformation online. Our study evaluates the factual accuracy, stability, and biases in widely adopted GPT models, including GPT-3.5 and GPT-4, contributing to the reliability and integrity of AI-mediated information dissemination. We introduce 'Global-Liar,' a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases. Our analysis reveals that newer iterations of GPT models do not always equate to improved performance. Notably, the GPT-4 version from March demonstrates higher factual accuracy than its subsequent June release. Furthermore, a concerning bias is observed, privileging statements from the Global North over the Global South, thus potentially exacerbating existing informational inequities. Regions such as Africa and the Middle East are at a disadvantage, with much lower factual accuracy. The performance fluctuations over time suggest that model updates may not consistently benefit all regions equally. Our study also offers insights into the impact of various LLM configuration settings, such as binary decision forcing, model re-runs, and temperature, on a model's factuality. Models constrained to binary (true/false) choices exhibit reduced factuality compared to those allowing an 'unclear' option. Single inference at a low temperature setting matches the reliability of majority voting across various configurations. The insights gained highlight the need for culturally diverse and geographically inclusive model training and evaluation. This approach is key to achieving global equity in technology, distributing AI benefits fairly worldwide.

CL Mar 28, 2025
Understanding Inequality of LLM Fact-Checking over Geographic Regions with Agent and Retrieval models

Bruno Coelho, Shujaat Mirza, Yuyuan Cui et al.

Fact-checking is a potentially useful application of Large Language Models (LLMs) to combat the growing dissemination of disinformation. However, the performance of LLMs varies across geographic regions. In this paper, we evaluate the factual accuracy of open and private models across a diverse set of regions and scenarios. Using a dataset containing 600 fact-checked statements balanced across six global regions, we examine three experimental setups of fact-checking a statement: (1) when just the statement is available, (2) when an LLM-based agent with Wikipedia access is utilized, and (3) as a best-case scenario, when a Retrieval-Augmented Generation (RAG) system provided with the official fact check is employed. Our findings reveal that regardless of the scenario and LLM used, including GPT-4, Claude Sonnet, and LLaMA, models perform substantially better on statements from the Global North than on those from the Global South. Furthermore, this gap is broadened for the more realistic case of a Wikipedia agent-based system, highlighting that overly general knowledge bases have a limited ability to address region-specific nuances. These results underscore the urgent need for better dataset balancing and robust retrieval strategies to enhance LLM fact-checking capabilities, particularly in geographically diverse contexts.

CR May 28, 2020
The Tools and Tactics Used in Intimate Partner Surveillance: An Analysis of Online Infidelity Forums

Emily Tseng, Rosanna Bellini, Nora McDonald et al.

Abusers increasingly use spyware apps, account compromise, and social engineering to surveil their intimate partners, causing substantial harms that can culminate in violence. This form of privacy violation, termed intimate partner surveillance (IPS), is a profoundly challenging problem to address due to the physical access and trust present in the relationship between the target and attacker. While previous research has examined IPS from the perspectives of survivors, we present the first measurement study of online forums in which (potential) attackers discuss IPS strategies and techniques. In domains such as cybercrime, child abuse, and human trafficking, studying the online behaviors of perpetrators has led to better threat intelligence and techniques to combat attacks. We aim to provide similar insights in the context of IPS. We identified five online forums containing discussion of monitoring cellphones and other means of surveilling an intimate partner, including three within the context of investigating relationship infidelity. We perform a mixed-methods analysis of these forums, surfacing the tools and tactics that attackers use to perform surveillance. Via qualitative analysis of forum content, we present a taxonomy of IPS strategies used and recommended by attackers, and synthesize lessons for technologists seeking to curb the spread of IPS.

CR Dec 2, 2018
Towards Automatic Discovery of Cybercrime Supply Chains

Rasika Bhalerao, Maxwell Aliapoulios, Ilia Shumailov et al.

Cybercrime forums enable modern criminal entrepreneurs to collaborate with other criminals in increasingly efficient and sophisticated criminal endeavors. Understanding the connections between different products and services can often illuminate effective interventions. However, generating this understanding of supply chains currently requires time-consuming manual effort. In this paper, we propose a language-agnostic method to automatically extract supply chains from cybercrime forum posts and replies. Our supply chain detection algorithm can identify 36% and 58% relevant chains within major English and Russian forums, respectively, showing improvements over the baselines of 13% and 36%, respectively. Our analysis of the automatically generated supply chains demonstrates underlying connections between products and services within these forums. For example, the extracted supply chain illuminated the connection between hack-for-hire services and the selling of rare and valuable 'OG' accounts, which has only recently been reported. The understanding of connections between products and services exposes potentially effective intervention points.

CR May 11, 2018
Under the Underground: Predicting Private Interactions in Underground Forums

Rebekah Overdorf, Carmela Troncoso, Rachel Greenstadt et al.

Underground forums where users discuss, buy, and sell illicit services and goods facilitate a better understanding of the economy and organization of cybercriminals. Prior work has shown that private interactions in particular provide a wealth of information about the cybercriminal ecosystem. Yet, those messages are seldom available to analysts, except when there is a leak. To address this problem, we propose a supervised machine-learning-based method able to predict which public threads will generate private messages, after a partial leak of such messages has occurred. To the best of our knowledge, we are the first to develop a solution to overcome the barrier posed by limited to no information on private activity for underground forum analysis. Additionally, we propose an automated method for labeling posts, significantly reducing the cost of our approach in the presence of real unlabeled data. This method can be tuned to focus on the likelihood of users receiving private messages, or threads triggering private interactions. We evaluate the performance of our methods using data from three real forum leaks. Our results show that public information can indeed be used to predict private activity, although prediction models do not transfer well between forums. We also find that neither the length of the leak period nor the time between the leak and the prediction has a significant impact on our technique's performance, and that NLP features dominate the prediction power.

CL Aug 31, 2017
Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation

Greg Durrett, Jonathan K. Kummerfeld, Taylor Berg-Kirkpatrick et al.

One weakness of machine-learned NLP models is that they typically perform poorly on out-of-domain data. In this work, we study the task of identifying products being bought and sold in online cybercrime forums, which exhibits particularly challenging cross-domain effects. We formulate a task that represents a hybrid of slot-filling information extraction and named entity recognition and annotate data from four different forums. Each of these forums constitutes its own "fine-grained domain" in that the forums cover different market sectors with different properties, even though all forums are in the broad domain of cybercrime. We characterize these domain differences in the context of a learning-based system: supervised models see decreased accuracy when applied to new forums, and standard techniques for semi-supervised learning and domain adaptation have limited effectiveness on this data, which suggests the need to improve these techniques. We release a dataset of 1,938 annotated posts from across the four forums.

CR Aug 14, 2015
Stress Testing the Booters: Understanding and Undermining the Business of DDoS Services

Mohammad Karami, Youngsam Park, Damon McCoy

DDoS-for-hire services, also known as booters, have commoditized DDoS attacks and enabled abusive subscribers of these services to cheaply extort, harass and intimidate businesses and people by knocking them offline. However, due to the underground nature of these booters, little is known about their underlying technical and business structure. In this paper we empirically measure many facets of their technical and payment infrastructure. We also perform an analysis of leaked and scraped data from three major booters (Asylum Stresser, Lizard Stresser and VDO), which provides us with an in-depth view of their customers and victims. Finally, we conduct a large-scale payment intervention in collaboration with PayPal and evaluate its effectiveness. Based on our analysis we show that these services are responsible for hundreds of thousands of DDoS attacks and identify potentially promising methods of increasing booters' costs and undermining these services.