CRJul 11, 2024
Automatic Generation of Web Censorship Probe ListsJenny Tang, Leo Alvarez, Arjun Brar et al.
Domain probe lists--used to determine which URLs to probe for Web censorship--play a critical role in Internet censorship measurement studies. Indeed, the size and accuracy of the domain probe list limits the set of censored pages that can be detected; inaccurate lists can lead to an incomplete view of the censorship landscape or biased results. Previous efforts to generate domain probe lists have been mostly manual or crowdsourced. This approach is time-consuming, prone to errors, and does not scale well to the ever-changing censorship landscape. In this paper, we explore methods for automatically generating probe lists that are both comprehensive and up-to-date for Web censorship measurement. We start from an initial set of 139,957 unique URLs from various existing test lists consisting of pages from a variety of languages to generate new candidate pages. By analyzing content from these URLs (i.e., performing topic and keyword extraction), expanding these topics, and using them as a feed to search engines, our method produces 119,255 new URLs across 35,147 domains. We then test the new candidate pages by attempting to access each URL from servers in eleven different global locations over a span of four months to check for their connectivity and potential signs of censorship. Our measurements reveal that our method discovered over 1,400 domains--not present in the original dataset--we suspect to be blocked. In short, automatically updating probe lists is possible, and can help further automate censorship measurements at scale.
CYApr 13
A Cross-Country Evaluation of Sentiment Toward Digital Payment Systems in AfricaIsabel Agadagba, Triphonia Kilasara, Takudzwa Tarutira et al.
Digital payment systems have become a cornerstone of consumer finance in Africa. Prominent payment categories include money transfer applications, mobile money, cryptocurrencies, stablecoins, and central bank digital currencies (CBDCs). While there are studies exploring how and why people use individual digital payment systems (both in Africa and beyond), we lack a good understanding of why people choose between different categories of payment systems, and how they view the tradeoffs between different categories. We conducted qualitative interviews in three African countries -- Nigeria, Tanzania, and Zimbabwe -- to understand how and why people use various payment systems, and what influenced them to start using these systems. Our study highlights several notable findings regarding tradeoffs between perceived utility, privacy, and security. For example, many users trust government issuers to protect them from scams, but they do not trust those same institutions to build reliable systems and products or prioritize customer satisfaction. We also find that most users have accounts on multiple payment systems, and conduct a complex selection process using different platforms for different types of payments. This selection process is driven in part by financial considerations, but also by security, privacy, and trust preferences. Our findings suggest compelling directions for regulators and the research community to design systems that balance users' trust and utility needs.
CRMar 25
A Large-Scale Study of Telegram BotsTaro Tsuchiya, Haoxiang Yu, Tina Marjanov et al.
Telegram, initially a messaging app, has evolved into a platform where users can interact with various services through programmable applications, bots. Bots provide a wide range of uses, from moderating groups, helping with online shopping, to even executing trades in financial markets. However, Telegram has been increasingly associated with various illicit activities -- financial scams, stolen data, non-consensual image sharing, among others, raising concerns bots may be facilitating these operations. This paper is the first to characterize Telegram bots at scale, through the following contributions. First, we offer the largest general-purpose message dataset and the first bot dataset. Through snowball sampling from two published datasets, we uncover over 67,000 additional channels, 492 million messages, and 32,000 bots. Second, we develop a system to automatically interact with bots in order to extract their functionality. Third, based on their description, chat responses, and the associated channels, we classify bots into several domains. Fourth, we investigate the communities each bot serves, by analyzing supported languages, usage patterns (e.g., duration, reuse), and network topology. While our analysis discovers useful applications such as crowdsourcing, we also identify malicious bots (e.g., used for financial scams, illicit underground services) serving as payment gateways, referral systems, and malicious AI endpoints. By exhorting the research community to look at bots as software infrastructure, this work hopes to foster further research useful to content moderators, and to help interventions against illicit activities.
HCNov 16, 2021
Exploring Usable Security to Improve the Impact of Formal Verification: A Research AgendaCarolina Carreira, João F. Ferreira, Alexandra Mendes et al.
As software becomes more complex and assumes an even greater role in our lives, formal verification is set to become the gold standard in securing software systems into the future, since it can guarantee the absence of errors and entire classes of attack. Recent advances in formal verification are being used to secure everything from unmanned drones to the internet. At the same time, the usable security research community has made huge progress in improving the usability of security products and end-users comprehension of security issues. However, there have been no human-centered studies focused on the impact of formal verification on the use and adoption of formally verified software products. We propose a research agenda to fill this gap and to contribute with the first collection of studies on people's mental models on formal verification and associated security and privacy guarantees and threats. The proposed research has the potential to increase the adoption of more secure products and it can be directly used by the security and formal methods communities to create more effective and secure software tools.
CLMar 31, 2021
Self-Supervised Euphemism Detection and Identification for Content ModerationWanzheng Zhu, Hongyu Gong, Rohan Bansal et al.
Fringe groups and organizations have a long history of using euphemisms--ordinary-sounding words with a secret meaning--to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a "ban list", but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider "pot" (storage container or marijuana?) or "heater" (household appliance or firearm?) The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind. This paper will demonstrate unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically, and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30-400% higher detection accuracies of unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators.
CRJul 9, 2019
ICLab: A Global, Longitudinal Internet Censorship Measurement PlatformArian Akhavan Niaki, Shinyoung Cho, Zachary Weinberg et al.
Researchers have studied Internet censorship for nearly as long as attempts to censor contents have taken place. Most studies have however been limited to a short period of time and/or a few countries; the few exceptions have traded off detail for breadth of coverage. Collecting enough data for a comprehensive, global, longitudinal perspective remains challenging. In this work, we present ICLab, an Internet measurement platform specialized for censorship research. It achieves a new balance between breadth of coverage and detail of measurements, by using commercial VPNs as vantage points distributed around the world. ICLab has been operated continuously since late 2016. It can currently detect DNS manipulation and TCP packet injection, and overt "block pages" however they are delivered. ICLab records and archives raw observations in detail, making retrospective analysis with new techniques possible. At every stage of processing, ICLab seeks to minimize false positives and manual validation. Within 53,906,532 measurements of individual web pages, collected by ICLab in 2017 and 2018, we observe blocking of 3,602 unique URLs in 60 countries. Using this data, we compare how different blocking techniques are deployed in different regions and/or against different types of content. Our longitudinal monitoring pinpoints changes in censorship in India and Turkey concurrent with political shifts, and our clustering techniques discover 48 previously unknown block pages. ICLab's broad and detailed measurements also expose other forms of network interference, such as surveillance and malware injection.
CRApr 13, 2017
An Empirical Analysis of Traceability in the Monero BlockchainMalte Möser, Kyle Soska, Ethan Heilman et al.
Monero is a privacy-centric cryptocurrency that allows users to obscure their transactions by including chaff coins, called "mixins," along with the actual coins they spend. In this paper, we empirically evaluate two weaknesses in Monero's mixin sampling strategy. First, about 62% of transaction inputs with one or more mixins are vulnerable to "chain-reaction" analysis -- that is, the real input can be deduced by elimination. Second, Monero mixins are sampled in such a way that they can be easily distinguished from the real coins by their age distribution; in short, the real input is usually the "newest" input. We estimate that this heuristic can be used to guess the real input with 80% accuracy over all transactions with 1 or more mixins. Next, we turn to the Monero ecosystem and study the importance of mining pools and the former anonymous marketplace AlphaBay on the transaction volume. We find that after removing mining pool activity, there remains a large amount of potentially privacy-sensitive transactions that are affected by these weaknesses. We propose and evaluate two countermeasures that can improve the privacy of future transactions.
CRNov 9, 2016
A Public Comment on NCCoE's White Paper on Privacy-Enhancing Identity BrokersLuís T. A. N. Brandão, Nicolas Christin, George Danezis
The National Cybersecurity Center of Excellence (NCCoE) (in the United States) has published on October 19, 2015, a white paper on "privacy-enhanced identity brokers." We present here a reply to their request for public comments. We enumerate concerns whose consideration we find paramount for the design of a privacy-enhancing identity brokering solution, for identification and authentication of citizens into myriad online services, and we recommend how to incorporate them into a revised white paper. Our observations, focused on privacy, security, auditability and forensics, are mostly based on a recently published research paper (PETS 2015) about two nation-scale brokered identification systems.
GTSep 16, 2014
Audit Games with Multiple Defender ResourcesJeremiah Blocki, Nicolas Christin, Anupam Datta et al.
Modern organizations (e.g., hospitals, social networks, government agencies) rely heavily on audit to detect and punish insiders who inappropriately access and disclose confidential information. Recent work on audit games models the strategic interaction between an auditor with a single audit resource and auditees as a Stackelberg game, augmenting associated well-studied security games with a configurable punishment parameter. We significantly generalize this audit game model to account for multiple audit resources where each resource is restricted to audit a subset of all potential violations, thus enabling application to practical auditing scenarios. We provide an FPTAS that computes an approximately optimal solution to the resulting non-convex optimization problem. The main technical novelty is in the design and correctness proof of an optimization transformation that enables the construction of this FPTAS. In addition, we experimentally demonstrate that this transformation significantly speeds up computation of solutions for a class of audit games and security games.
GTMar 2, 2013
Audit GamesJeremiah Blocki, Nicolas Christin, Anupam Datta et al.
Effective enforcement of laws and policies requires expending resources to prevent and detect offenders, as well as appropriate punishment schemes to deter violators. In particular, enforcement of privacy laws and policies in modern organizations that hold large volumes of personal information (e.g., hospitals, banks, and Web services providers) relies heavily on internal audit mechanisms. We study economic considerations in the design of these mechanisms, focusing in particular on effective resource allocation and appropriate punishment schemes. We present an audit game model that is a natural generalization of a standard security game model for resource allocation with an additional punishment parameter. Computing the Stackelberg equilibrium for this game is challenging because it involves solving an optimization problem with non-convex quadratic constraints. We present an additive FPTAS that efficiently computes a solution that is arbitrarily close to the optimal solution.
CYJul 31, 2012
Traveling the Silk Road: A measurement analysis of a large anonymous online marketplaceNicolas Christin
We perform a comprehensive measurement analysis of Silk Road, an anonymous, international online marketplace that operates as a Tor hidden service and uses Bitcoin as its exchange currency. We gather and analyze data over eight months between the end of 2011 and 2012, including daily crawls of the marketplace for nearly six months in 2012. We obtain a detailed picture of the type of goods being sold on Silk Road, and of the revenues made both by sellers and Silk Road operators. Through examining over 24,400 separate items sold on the site, we show that Silk Road is overwhelmingly used as a market for controlled substances and narcotics, and that most items sold are available for less than three weeks. The majority of sellers disappears within roughly three months of their arrival, but a core of 112 sellers has been present throughout our measurement interval. We evaluate the total revenue made by all sellers, from public listings, to slightly over USD 1.2 million per month; this corresponds to about USD 92,000 per month in commissions for the Silk Road operators. We further show that the marketplace has been operating steadily, with daily sales and number of sellers overall increasing over our measurement interval. We discuss economic and policy implications of our analysis and results, including ethical considerations for future research in this area.