CRSep 12, 2023
Verifiable Fairness: Privacy-preserving Computation of Fairness for Machine Learning SystemsEhsan Toreini, Maryam Mehrnezhad, Aad van Moorsel
Fair machine learning is a thriving and vibrant research topic. In this paper, we propose Fairness as a Service (FaaS), a secure, verifiable and privacy-preserving protocol to computes and verify the fairness of any machine learning (ML) model. In the deisgn of FaaS, the data and outcomes are represented through cryptograms to ensure privacy. Also, zero knowledge proofs guarantee the well-formedness of the cryptograms and underlying data. FaaS is model--agnostic and can support various fairness metrics; hence, it can be used as a service to audit the fairness of any ML model. Our solution requires no trusted third party or private channels for the computation of the fairness metric. The security guarantees and commitments are implemented in a way that every step is securely transparent and verifiable from the start to the end of the process. The cryptograms of all input data are publicly available for everyone, e.g., auditors, social activists and experts, to verify the correctness of the process. We implemented FaaS to investigate performance and demonstrate the successful use of FaaS for a publicly available data set with thousands of entries.
LGFeb 3, 2023
VT-GAN: Cooperative Tabular Data Synthesis using Vertical Federated LearningZilong Zhao, Han Wu, Aad Van Moorsel et al.
This paper presents the application of Vertical Federated Learning (VFL) to generate synthetic tabular data using Generative Adversarial Networks (GANs). VFL is a collaborative approach to train machine learning models among distinct tabular data holders, such as financial institutions, who possess disjoint features for the same group of customers. In this paper we introduce the VT-GAN framework, Vertical federated Tabular GAN, and demonstrate that VFL can be successfully used to implement GANs for distributed tabular data in privacy-preserving manner, with performance close to centralized GANs that assume shared data. We make design choices with respect to the distribution of GAN generator and discriminator models and introduce a training-with-shuffling technique so that no party can reconstruct training data from the GAN conditional vector. The paper presents (1) an implementation of VT-GAN, (2) a detailed quality evaluation of the VT-GAN-generated synthetic data, (3) an overall scalability examination of VT-GAN framework, (4) a security analysis on VT-GAN's robustness against Membership Inference Attack with different settings of Differential Privacy, for a range of datasets with diverse distribution characteristics. Our results demonstrate that VT-GAN can consistently generate high-fidelity synthetic tabular data of comparable quality to that generated by a centralized GAN algorithm. The difference in machine learning utility can be as low as 2.7%, even under extremely imbalanced data distributions across clients or with different numbers of clients.
CVJun 5, 2022
LDRNet: Enabling Real-time Document Localization on Mobile DevicesHan Wu, Holland Qian, Huaming Wu et al.
While Identity Document Verification (IDV) technology on mobile devices becomes ubiquitous in modern business operations, the risk of identity theft and fraud is increasing. The identity document holder is normally required to participate in an online video interview to circumvent impostors. However, the current IDV process depends on an additional human workforce to support online step-by-step guidance which is inefficient and expensive. The performance of existing AI-based approaches cannot meet the real-time and lightweight demands of mobile devices. In this paper, we address those challenges by designing an edge intelligence-assisted approach for real-time IDV. Aiming at improving the responsiveness of the IDV process, we propose a new document localization model for mobile devices, LDRNet, to Localize the identity Document in Real-time. On the basis of a lightweight backbone network, we build three prediction branches for LDRNet, the corner points prediction, the line borders prediction and the document classification. We design novel supplementary targets, the equal-division points, and use a new loss function named Line Loss, to improve the speed and accuracy of our approach. In addition to the IDV process, LDRNet is an efficient and reliable document localization alternative for all kinds of mobile applications. As a matter of proof, we compare the performance of LDRNet with other popular approaches on localizing general document datasets. The experimental results show that LDRNet runs at a speed up to 790 FPS which is 47x faster, while still achieving comparable Jaccard Index(JI) in single-model and single-scale tests.
CYDec 17, 2021Code
Know Your Customer: Balancing Innovation and Regulation for Financial InclusionKaren Elliott, Kovila Coopamootoo, Edward Curran et al.
Financial inclusion depends on providing adjusted services for citizens with disclosed vulnerabilities. At the same time, the financial industry needs to adhere to a strict regulatory framework, which is often in conflict with the desire for inclusive, adaptive, and privacy-preserving services. In this article we study how this tension impacts the deployment of privacy-sensitive technologies aimed at financial inclusion. We conduct a qualitative study with banking experts to understand their perspectives on service development for financial inclusion. We build and demonstrate a prototype solution based on open source decentralized identifiers and verifiable credentials software and report on feedback from the banking experts on this system. The technology is promising thanks to its selective disclosure of vulnerabilities to the full control of the individual. This supports GDPR requirements, but at the same time, there is a clear tension between introducing these technologies and fulfilling other regulatory requirements, particularly with respect to 'Know Your Customer.' We consider the policy implications stemming from these tensions and provide guidelines for the further design of related technologies.
21.3CYMay 6
Trustworthiness in Digital Twin Systems: Systematic Review and Research HorizonsChi Fai David Lam, Aad van Moorsel, Zoya Pourmirza
Digital Twins (DTs) are increasingly deployed across application domains, yet the treatment of trust-related issues remains unevenly addressed. To examine whether and how trust is discussed in the current landscape, we conducted a systematic review of existing DT review papers and a mapping of their abstracts. Seven trust-related challenges and seven trust-enhancing strategies were defined to guide the analysis, enabling the trust focus of each paper to be characterised. By aggregating the challenges and strategies referenced across domains, distinct patterns of emphasis were observed. With certain domains consistently sharing similar spectrum of trust concerns, four integration types, including human-centred, safety-critical, context-specific, and technologically-driven, were identified as emergent categories reflecting how trust is prioritised in different deployment contexts. Drawing on the characteristics of these types, several preliminary directions for future research were proposed. These include the development of trust-by-design principles to inform early-stage decision-making, the inclusion of trust metadata in platform schemas to prompt systematic developer consideration of trust factors, and the exploration of how architectural choices, such as federated DTs, influence user trust.
CYJun 10, 2021
Identifying and Supporting Financially Vulnerable Consumers in a Privacy-Preserving Manner: A Use Case Using Decentralised Identifiers and Verifiable CredentialsTasos Spiliotopoulos, Dave Horsfall, Magdalene Ng et al.
Vulnerable individuals have a limited ability to make reasonable financial decisions and choices and, thus, the level of care that is appropriate to be provided to them by financial institutions may be different from that required for other consumers. Therefore, identifying vulnerability is of central importance for the design and effective provision of financial services and products. However, validating the information that customers share and respecting their privacy are both particularly important in finance and this poses a challenge for identifying and caring for vulnerable populations. This position paper examines the potential of the combination of two emerging technologies, Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), for the identification of vulnerable consumers in finance in an efficient and privacy-preserving manner.
CRMar 18, 2021
Stochastic Simulation Techniques for Inference and Sensitivity Analysis of Bayesian Attack GraphsIsaac Matthews, Sadegh Soudjani, Aad van Moorsel
A vulnerability scan combined with information about a computer network can be used to create an attack graph, a model of how the elements of a network could be used in an attack to reach specific states or goals in the network. These graphs can be understood probabilistically by turning them into Bayesian attack graphs, making it possible to quantitatively analyse the security of large networks. In the event of an attack, probabilities on the graph change depending on the evidence discovered (e.g., by an intrusion detection system or knowledge of a host's activity). Since such scenarios are difficult to solve through direct computation, we discuss and compare three stochastic simulation techniques for updating the probabilities dynamically based on the evidence and compare their speed and accuracy. From our experiments we conclude that likelihood weighting is most efficient for most uses. We also consider sensitivity analysis of BAGs, to identify the most critical nodes for protection of the network and solve the uncertainty problem in the assignment of priors to nodes. Since sensitivity analysis can easily become computationally expensive, we present and demonstrate an efficient sensitivity analysis approach that exploits a quantitative relation with stochastic inference.
CRSep 25, 2020
Investigation of 3-D Secure's Model for Fraud DetectionMohammed Aamir Ali, Thomas Groß, Aad van Moorsel
Background. 3-D Secure 2.0 (3DS 2.0) is an identity federation protocol authenticating the payment initiator for credit card transactions on the Web. Aim. We aim to quantify the impact of factors used by 3DS 2.0 in its fraud-detection decision making process. Method. We ran credit card transactions with two Web sites systematically manipulating the nominal IVs \textsf{machine\_data}, \textsf{value}, \textsf{region}, and \textsf{website}. We measured whether the user was \textsf{challenged} with an authentication, whether the transaction was \textsf{declined}, and whether the card was \textsf{blocked} as nominal DVs. Results. While \textsf{website} and \textsf{card} largely did not show a significant impact on any outcome, \textsf{machine\_data}, \textsf{value} and \textsf{region} did. A change in \textsf{machine\_data}, \textsf{region} or \textsf{value} made it 5-7 times as likely to be challenged with password authentication. However, even in a foreign region with another factor being changed, the overall likelihood of being challenged only reached $60\%$. When in the card's home region, a transaction will be rarely declined ($< 5\%$ in control, $40\%$ with one factor changed). However, in a region foreign to the card the system will more likely decline transactions anyway (about $60\%$) and any change in \textsf{machine\_data} or \textsf{value} will lead to a near-certain declined transaction. The \textsf{region} was the only significant predictor for a card being blocked ($\mathsf{OR}=3$). Conclusions. We found that the decisions to challenge the user with a password authentication, to decline a transaction and to block a card are governed by different weightings. 3DS 2.0 is most likely to decline transactions, especially in a foreign region. It is less likely to challenge users with password authentication, even if \textsf{machine\_data} or \textsf{value} are changed.
LGJul 17, 2020
Technologies for Trustworthy Machine Learning: A Survey in a Socio-Technical ContextEhsan Toreini, Mhairi Aitken, Kovila P. L. Coopamootoo et al.
Concerns about the societal impact of AI-based services and systems has encouraged governments and other organisations around the world to propose AI policy frameworks to address fairness, accountability, transparency and related topics. To achieve the objectives of these frameworks, the data and software engineers who build machine-learning systems require knowledge about a variety of relevant supporting tools and techniques. In this paper we provide an overview of technologies that support building trustworthy machine learning systems, i.e., systems whose properties justify that people place trust in them. We argue that four categories of system properties are instrumental in achieving the policy objectives, namely fairness, explainability, auditability and safety & security (FEAS). We discuss how these properties need to be considered across all stages of the machine learning life cycle, from data collection through run-time model inference. As a consequence, we survey in this paper the main technologies with respect to all four of the FEAS properties, for data-centric as well as model-centric stages of the machine learning system life cycle. We conclude with an identification of open research problems, with a particular focus on the connection between trustworthy machine learning technologies and their implications for individuals and society.
HCJun 27, 2020
Simulating the Effects of Social Presence on Trust, Privacy Concerns & Usage Intentions in Automated Bots for FinanceMagdalene Ng, Kovila P. L. Coopamootoo, Ehsan Toreini et al.
FinBots are chatbots built on automated decision technology, aimed to facilitate accessible banking and to support customers in making financial decisions. Chatbots are increasing in prevalence, sometimes even equipped to mimic human social rules, expectations and norms, decreasing the necessity for human-to-human interaction. As banks and financial advisory platforms move towards creating bots that enhance the current state of consumer trust and adoption rates, we investigated the effects of chatbot vignettes with and without socio-emotional features on intention to use the chatbot for financial support purposes. We conducted a between-subject online experiment with N = 410 participants. Participants in the control group were provided with a vignette describing a secure and reliable chatbot called XRO23, whereas participants in the experimental group were presented with a vignette describing a secure and reliable chatbot that is more human-like and named Emma. We found that Vignette Emma did not increase participants' trust levels nor lowered their privacy concerns even though it increased perception of social presence. However, we found that intention to use the presented chatbot for financial support was positively influenced by perceived humanness and trust in the bot. Participants were also more willing to share financially-sensitive information such as account number, sort code and payments information to XRO23 compared to Emma - revealing a preference for a technical and mechanical FinBot in information sharing. Overall, this research contributes to our understanding of the intention to use chatbots with different features as financial technology, in particular that socio-emotional support may not be favoured when designed independently of financial function.
CRMay 13, 2020
Cyclic Bayesian Attack Graphs: A Systematic Computational ApproachIsaac Matthews, John Mace, Sadegh Soudjani et al.
Attack graphs are commonly used to analyse the security of medium-sized to large networks. Based on a scan of the network and likelihood information of vulnerabilities, attack graphs can be transformed into Bayesian Attack Graphs (BAGs). These BAGs are used to evaluate how security controls affect a network and how changes in topology affect security. A challenge with these automatically generated BAGs is that cycles arise naturally, which make it impossible to use Bayesian network theory to calculate state probabilities. In this paper we provide a systematic approach to analyse and perform computations over cyclic Bayesian attack graphs. %thus providing a generic approach to handle cycles as well as unifying the theory of Bayesian attack graphs. Our approach first formally introduces two commonly used versions of Bayesian attack graphs and compares their expressiveness. We then present an interpretation of Bayesian attack graphs based on combinational logic circuits, which facilitates an intuitively attractive systematic treatment of cycles. We prove properties of the associated logic circuit and present an algorithm that computes state probabilities without altering the attack graphs (e.g., remove an arc to remove a cycle). Moreover, our algorithm deals seamlessly with all cycles without the need to identify their types. A set of experiments using synthetically created networks demonstrates the scalability of the algorithm on computer networks with hundreds of machines, each with multiple vulnerabilities.
CRApr 28, 2020
BlockSim: An Extensible Simulation Tool for Blockchain SystemsMaher Alharby, Aad van Moorsel
Both in the design and deployment of blockchain solutions many performance-impacting configuration choices need to be made. We introduce BlockSim, a framework and software tool to build and simulate discrete-event dynamic systems models for blockchain systems. BlockSim is designed to support the analysis of a large variety of blockchains and blockchain deployments as well as a wide set of analysis questions. At the core of BlockSim is a Base Model, which contains the main model constructs common across various blockchain systems organized in three abstraction layers (network, consensus and incentives layer). The Base Model is usable for a wide variety of blockchain systems and can be extended easily to include system or deployment particulars. The BlockSim software tool provides a simulator that implements the Base Model in Python. This paper describes the Base Model, the simulator implementation, and the application of BlockSim to Bitcoin, Ethereum and other consensus algorithms. We validate BlockSim simulation results by comparison with performance results from actual systems and from other studies in the literature. We close the paper by a BlockSim simulation study of the impact of uncle blocks rewards on mining decentralization, for a variety of blockchain configurations.
CRApr 27, 2020
Data-Driven Model-Based Analysis of the Ethereum Verifier's DilemmaMaher Alharby, Roben Castagna Lunardi, Amjad Aldweesh et al.
In proof-of-work based blockchains such as Ethereum, verification of blocks is an integral part of establishing consensus across nodes. However, in Ethereum, miners do not receive a reward for verifying. This implies that miners face the Verifier's Dilemma: use resources for verification, or use them for the more lucrative mining of new blocks? We provide an extensive analysis of the Verifier's Dilemma, using a data-driven model-based approach that combines closed-form expressions, machine learning techniques and discrete-event simulation. We collect data from over 300,000 smart contracts and experimentally obtain their CPU execution times. Gaussian Mixture Models and Random Forest Regression transform the data into distributions and inputs suitable for the simulator. We show that, indeed, it is often economically rational not to verify. We consider two approaches to mitigate the implications of the Verifier's Dilemma, namely parallelization and active insertion of invalid blocks, both will be shown to be effective.
CYNov 27, 2019
The relationship between trust in AI and trustworthy machine learning technologiesEhsan Toreini, Mhairi Aitken, Kovila Coopamootoo et al.
To build AI-based systems that users and the public can justifiably trust one needs to understand how machine learning technologies impact trust put in these services. To guide technology developments, this paper provides a systematic approach to relate social science concepts of trust with the technologies used in AI-based services and products. We conceive trust as discussed in the ABI (Ability, Benevolence, Integrity) framework and use a recently proposed mapping of ABI on qualities of technologies. We consider four categories of machine learning technologies, namely these for Fairness, Explainability, Auditability and Safety (FEAS) and discuss if and how these possess the required qualities. Trust can be impacted throughout the life cycle of AI-based systems, and we introduce the concept of Chain of Trust to discuss technological needs for trust in different stages of the life cycle. FEAS has obvious relations with known frameworks and therefore we relate FEAS to a variety of international Principled AI policy and technology frameworks that have emerged in recent years.
CROct 17, 2017
Blockchain-based Smart Contracts: A Systematic Mapping StudyMaher Alharby, Aad van Moorsel
An appealing feature of blockchain technology is smart contracts. A smart contract is executable code that runs on top of the blockchain to facilitate, execute and enforce an agreement between untrusted parties without the involvement of a trusted third party. In this paper, we conduct a systematic mapping study to collect all research that is relevant to smart contracts from a technical perspective. The aim of doing so is to identify current research topics and open challenges for future studies in smart contract research. We extract 24 papers from different scientific databases. The results show that about two thirds of the papers focus on identifying and tackling smart contract issues. Four key issues are identified, namely, codifying, security, privacy and performance issues. The rest of the papers focuses on smart contract applications or other smart contract related topics. Research gaps that need to be addressed in future studies are provided.
CRAug 3, 2017
Betrayal, Distrust, and Rationality: Smart Counter-Collusion Contracts for Verifiable Cloud ComputingChangyu Dong, Yilei Wang, Amjad Aldweesh et al.
Cloud computing has become an irreversible trend. Together comes the pressing need for verifiability, to assure the client the correctness of computation outsourced to the cloud. Existing verifiable computation techniques all have a high overhead, thus if being deployed in the clouds, would render cloud computing more expensive than the on-premises counterpart. To achieve verifiability at a reasonable cost, we leverage game theory and propose a smart contract based solution. In a nutshell, a client lets two clouds compute the same task, and uses smart contracts to stimulate tension, betrayal and distrust between the clouds, so that rational clouds will not collude and cheat. In the absence of collusion, verification of correctness can be done easily by crosschecking the results from the two clouds. We provide a formal analysis of the games induced by the contracts, and prove that the contracts will be effective under certain reasonable assumptions. By resorting to game theory and smart contracts, we are able to avoid heavy cryptographic protocols. The client only needs to pay two clouds to compute in the clear, and a small transaction fee to use the smart contracts. We also conducted a feasibility study that involves implementing the contracts in Solidity and running them on the official Ethereum network.