CRSep 12, 2023
Verifiable Fairness: Privacy-preserving Computation of Fairness for Machine Learning SystemsEhsan Toreini, Maryam Mehrnezhad, Aad van Moorsel
Fair machine learning is a thriving and vibrant research topic. In this paper, we propose Fairness as a Service (FaaS), a secure, verifiable and privacy-preserving protocol to computes and verify the fairness of any machine learning (ML) model. In the deisgn of FaaS, the data and outcomes are represented through cryptograms to ensure privacy. Also, zero knowledge proofs guarantee the well-formedness of the cryptograms and underlying data. FaaS is model--agnostic and can support various fairness metrics; hence, it can be used as a service to audit the fairness of any ML model. Our solution requires no trusted third party or private channels for the computation of the fairness metric. The security guarantees and commitments are implemented in a way that every step is securely transparent and verifiable from the start to the end of the process. The cryptograms of all input data are publicly available for everyone, e.g., auditors, social activists and experts, to verify the correctness of the process. We implemented FaaS to investigate performance and demonstrate the successful use of FaaS for a publicly available data set with thousands of entries.
CRAug 2, 2023
A Practical Deep Learning-Based Acoustic Side Channel Attack on KeyboardsJoshua Harrison, Ehsan Toreini, Maryam Mehrnezhad
With recent developments in deep learning, the ubiquity of micro-phones and the rise in online services via personal devices, acoustic side channel attacks present a greater threat to keyboards than ever. This paper presents a practical implementation of a state-of-the-art deep learning model in order to classify laptop keystrokes, using a smartphone integrated microphone. When trained on keystrokes recorded by a nearby phone, the classifier achieved an accuracy of 95%, the highest accuracy seen without the use of a language model. When trained on keystrokes recorded using the video-conferencing software Zoom, an accuracy of 93% was achieved, a new best for the medium. Our results prove the practicality of these side channel attacks via off-the-shelf equipment and algorithms. We discuss a series of mitigation methods to protect users against these series of attacks.
CLMar 30, 2025
Measuring Online Hate on 4chan using Pre-trained Deep Learning ModelsAdrian Bermudez-Villalva, Maryam Mehrnezhad, Ehsan Toreini
Online hate speech can harmfully impact individuals and groups, specifically on non-moderated platforms such as 4chan where users can post anonymous content. This work focuses on analysing and measuring the prevalence of online hate on 4chan's politically incorrect board (/pol/) using state-of-the-art Natural Language Processing (NLP) models, specifically transformer-based models such as RoBERTa and Detoxify. By leveraging these advanced models, we provide an in-depth analysis of hate speech dynamics and quantify the extent of online hate non-moderated platforms. The study advances understanding through multi-class classification of hate speech (racism, sexism, religion, etc.), while also incorporating the classification of toxic content (e.g., identity attacks and threats) and a further topic modelling analysis. The results show that 11.20% of this dataset is identified as containing hate in different categories. These evaluations show that online hate is manifested in various forms, confirming the complicated and volatile nature of detection in the wild.
HCFeb 9, 2022
"I feel invaded, annoyed, anxious and I may protect myself": Individuals' Feelings about Online Tracking and their Protective Behaviour across Gender and CountryKovila P. L. Coopamootoo, Maryam Mehrnezhad, Ehsan Toreini
Online tracking is a primary concern for Internet users, yet previous research has not found a clear link between the cognitive understanding of tracking and protective actions. We postulate that protective behaviour follows affective evaluation of tracking. We conducted an online study, with N=614 participants, across the UK, Germany and France, to investigate how users feel about third-party tracking and what protective actions they take. We found that most participants' feelings about tracking were negative, described as deeply intrusive - beyond the informational sphere, including feelings of annoyance and anxiety, that predict protective actions. We also observed indications of a `privacy gender gap', where women feel more negatively about tracking, yet are less likely to take protective actions, compared to men. And less UK individuals report negative feelings and protective actions, compared to those from Germany and France. This paper contributes insights into the affective evaluation of privacy threats and how it predicts protective behaviour. It also provides a discussion on the implications of these findings for various stakeholders, make recommendations and outline avenues for future work.
CRMar 10, 2021
Anti-Counterfeiting for Polymer Banknotes Based on Polymer Substrate FingerprintingShen Wang, Ehsan Toreini, Feng Hao
Polymer banknotes are the trend for printed currency and have been adopted by more than fifty countries worldwide. However, over the past years, the quantity of polymer counterfeits has been increasing, so has the quality of counterfeits. This shows that the initial advantage of bringing a new polymer technology to fight against counterfeiting is reducing. To maintain one step ahead of counterfeiters, we propose a novel anti-counterfeiting technique called Polymer Substrate Fingerprinting (PSF). Our technique is built based on the observation that the opacity coating, a critical step during the production of polymer notes, is a stochastic manufacturing process, leaving uneven thickness in the coating layer and the random dispersion of impurities from the ink. The imperfections in the coating layer result in random translucent patterns when a polymer banknote is back-lit by a light source. We show these patterns can be reliably captured by a commodity negative-film scanner and processed into a compact fingerprint to uniquely identify each banknote. Using an extensive dataset of 6,200 sample images collected from 340 UK banknotes, we show that our method can reliably authenticate banknotes, and is robust against rough daily handling of banknotes. Furthermore, we show the extracted fingerprints contain around 900 bits of entropy, which makes it extremely scalable to identify every polymer note circulated globally. As compared with previous or existing anti-counterfeiting mechanisms for banknotes, our method has a distinctive advantage: it ensures that even in the extreme case when counterfeiters have procured the same printing equipment and ink as used by a legitimate government, counterfeiting banknotes remains infeasible because of the difficulty to replicate a stochastic manufacturing process.
LGJul 17, 2020
Technologies for Trustworthy Machine Learning: A Survey in a Socio-Technical ContextEhsan Toreini, Mhairi Aitken, Kovila P. L. Coopamootoo et al.
Concerns about the societal impact of AI-based services and systems has encouraged governments and other organisations around the world to propose AI policy frameworks to address fairness, accountability, transparency and related topics. To achieve the objectives of these frameworks, the data and software engineers who build machine-learning systems require knowledge about a variety of relevant supporting tools and techniques. In this paper we provide an overview of technologies that support building trustworthy machine learning systems, i.e., systems whose properties justify that people place trust in them. We argue that four categories of system properties are instrumental in achieving the policy objectives, namely fairness, explainability, auditability and safety & security (FEAS). We discuss how these properties need to be considered across all stages of the machine learning life cycle, from data collection through run-time model inference. As a consequence, we survey in this paper the main technologies with respect to all four of the FEAS properties, for data-centric as well as model-centric stages of the machine learning system life cycle. We conclude with an identification of open research problems, with a particular focus on the connection between trustworthy machine learning technologies and their implications for individuals and society.
HCJun 27, 2020
Simulating the Effects of Social Presence on Trust, Privacy Concerns & Usage Intentions in Automated Bots for FinanceMagdalene Ng, Kovila P. L. Coopamootoo, Ehsan Toreini et al.
FinBots are chatbots built on automated decision technology, aimed to facilitate accessible banking and to support customers in making financial decisions. Chatbots are increasing in prevalence, sometimes even equipped to mimic human social rules, expectations and norms, decreasing the necessity for human-to-human interaction. As banks and financial advisory platforms move towards creating bots that enhance the current state of consumer trust and adoption rates, we investigated the effects of chatbot vignettes with and without socio-emotional features on intention to use the chatbot for financial support purposes. We conducted a between-subject online experiment with N = 410 participants. Participants in the control group were provided with a vignette describing a secure and reliable chatbot called XRO23, whereas participants in the experimental group were presented with a vignette describing a secure and reliable chatbot that is more human-like and named Emma. We found that Vignette Emma did not increase participants' trust levels nor lowered their privacy concerns even though it increased perception of social presence. However, we found that intention to use the presented chatbot for financial support was positively influenced by perceived humanness and trust in the bot. Participants were also more willing to share financially-sensitive information such as account number, sort code and payments information to XRO23 compared to Emma - revealing a preference for a technical and mechanical FinBot in information sharing. Overall, this research contributes to our understanding of the intention to use chatbots with different features as financial technology, in particular that socio-emotional support may not be favoured when designed independently of financial function.
CYNov 27, 2019
The relationship between trust in AI and trustworthy machine learning technologiesEhsan Toreini, Mhairi Aitken, Kovila Coopamootoo et al.
To build AI-based systems that users and the public can justifiably trust one needs to understand how machine learning technologies impact trust put in these services. To guide technology developments, this paper provides a systematic approach to relate social science concepts of trust with the technologies used in AI-based services and products. We conceive trust as discussed in the ABI (Ability, Benevolence, Integrity) framework and use a recently proposed mapping of ABI on qualities of technologies. We consider four categories of machine learning technologies, namely these for Fairness, Explainability, Auditability and Safety (FEAS) and discuss if and how these possess the required qualities. Trust can be impacted throughout the life cycle of AI-based systems, and we introduce the concept of Chain of Trust to discuss technological needs for trust in different stages of the life cycle. FEAS has obvious relations with known frameworks and therefore we relate FEAS to a variety of international Principled AI policy and technology frameworks that have emerged in recent years.
CRMay 30, 2019
DOMtegrity: Ensuring Web Page Integrity against Malicious Browser ExtensionsEhsan Toreini, Maryam Mehrnezhad, Siamak F. Shahandashti et al.
In this paper, we address an unsolved problem in the real world: how to ensure the integrity of the web content in a browser in the presence of malicious browser extensions? The problem of exposing confidential user credentials to malicious extensions has been widely understood, which has prompted major banks to deploy two-factor authentication. However, the importance of the `integrity' of the web content has received little attention. We implement two attacks on real-world online banking websites and show that ignoring the `integrity' of the web content can fundamentally defeat two-factor solutions. To address this problem, we propose a cryptographic protocol called DOMtegrity to ensure the end-to-end integrity of the DOM structure of a web page from delivering at a web server to the rendering of the page in the user's browser. DOMtegrity is the first solution that protects DOM integrity without modifying the browser architecture or requiring extra hardware. It works by exploiting subtle yet important differences between browser extensions and in-line JavaScript code. We show how DOMtegrity prevents the earlier attacks and a whole range of man-in-the-browser (MITB) attacks. We conduct extensive experiments on more than 14,000 real-world extensions to evaluate the effectiveness of DOMtegrity.
CRMay 6, 2017
Texture to the Rescue: Practical Paper Fingerprinting based on Texture PatternsEhsan Toreini, Siamak F. Shahandashti, Feng Hao
In this paper, we propose a novel paper fingerprinting technique based on analyzing the translucent patterns revealed when a light source shines through the paper. These patterns represent the inherent texture of paper, formed by the random interleaving of wooden particles during the manufacturing process. We show these patterns can be easily captured by a commodity camera and condensed into to a compact 2048-bit fingerprint code. Prominent works in this area (Nature 2005, IEEE S&P 2009, CCS 2011) have all focused on fingerprinting paper based on the paper "surface". We are motivated by the observation that capturing the surface alone misses important distinctive features such as the non-even thickness, the random distribution of impurities, and different materials in the paper with varying opacities. Through experiments, we demonstrate that the embedded paper texture provides a more reliable source for fingerprinting than features on the surface. Based on the collected datasets, we achieve 0% false rejection and 0% false acceptance rates. We further report that our extracted fingerprints contain 807 degrees-of-freedom (DoF), which is much higher than the 249 DoF with iris codes (that have the same size of 2048 bits). The high amount of DoF for texture-based fingerprints makes our method extremely scalable for recognition among very large databases; it also allows secure usage of the extracted fingerprint in privacy-preserving authentication schemes based on error correction techniques.
CRMay 18, 2016
Stealing PINs via Mobile Sensors: Actual Risk versus User PerceptionMaryam Mehrnezhad, Ehsan Toreini, Siamak F. Shahandashti et al.
In this paper, we present the actual risks of stealing user PINs by using mobile sensors versus the perceived risks by users. First, we propose PINlogger.js which is a JavaScript-based side channel attack revealing user PINs on an Android mobile phone. In this attack, once the user visits a website controlled by an attacker, the JavaScript code embedded in the web page starts listening to the motion and orientation sensor streams without needing any permission from the user. By analysing these streams, it infers the user's PIN using an artificial neural network. Based on a test set of fifty 4-digit PINs, PINlogger.js is able to correctly identify PINs in the first attempt with a success rate of 74% which increases to 86 and 94% in the second and third attempts, respectively. The high success rates of stealing user PINs on mobile devices via JavaScript indicate a serious threat to user security. With the technical understanding of the information leakage caused by mobile phone sensors, we then study users' perception of the risks associated with these sensors. We design user studies to measure the general familiarity with different sensors and their functionality, and to investigate how concerned users are about their PIN being discovered by an app that has access to all these sensors. Our studies show that there is significant disparity between the actual and perceived levels of threat with regard to the compromise of the user PIN. We confirm our results by interviewing our participants using two different approaches, within-subject and between-subject, and compare the results. We discuss how this observation, along with other factors, renders many academic and industry solutions ineffective in preventing such side channel attacks.
CRFeb 12, 2016
TouchSignatures: Identification of User Touch Actions and PINs Based on Mobile Sensor Data via JavaScriptMaryam Mehrnezhad, Ehsan Toreini, Siamak F. Shahandashti et al.
Conforming to W3C specifications, mobile web browsers allow JavaScript code in a web page to access motion and orientation sensor data without the user's permission. The associated risks to user security and privacy are however not considered in W3C specifications. In this work, for the first time, we show how user security can be compromised using these sensor data via browser, despite that the data rate is 3 to 5 times slower than what is available in app. We examine multiple popular browsers on Android and iOS platforms and study their policies in granting permissions to JavaScript code with respect to access to motion and orientation sensor data. Based on our observations, we identify multiple vulnerabilities, and propose TouchSignatures which implements an attack where malicious JavaScript code on an attack tab listens to such sensor data measurements. Based on these streams, TouchSignatures is able to distinguish the user's touch actions (i.e., tap, scroll, hold, and zoom) and her PINs, allowing a remote website to learn the client-side user activities. We demonstrate the practicality of this attack by collecting data from real users and reporting high success rates using our proof-of-concept implementations. We also present a set of potential solutions to address the vulnerabilities. The W3C community and major mobile browser vendors including Mozilla, Google, Apple and Opera have acknowledge our work and are implementing some of our proposed countermeasures.