Aad van Moorsel

h-index29

13papers

689citations

Novelty33%

AI Score25

Ranked #164,031 of 194,257 authors (top 84%)#4,671 in CR (top 69%)

13 Papers

9.5CRSep 12, 2023

Verifiable Fairness: Privacy-preserving Computation of Fairness for Machine Learning Systems

Ehsan Toreini, Maryam Mehrnezhad, Aad van Moorsel

Fair machine learning is a thriving and vibrant research topic. In this paper, we propose Fairness as a Service (FaaS), a secure, verifiable and privacy-preserving protocol to computes and verify the fairness of any machine learning (ML) model. In the deisgn of FaaS, the data and outcomes are represented through cryptograms to ensure privacy. Also, zero knowledge proofs guarantee the well-formedness of the cryptograms and underlying data. FaaS is model--agnostic and can support various fairness metrics; hence, it can be used as a service to audit the fairness of any ML model. Our solution requires no trusted third party or private channels for the computation of the fairness metric. The security guarantees and commitments are implemented in a way that every step is securely transparent and verifiable from the start to the end of the process. The cryptograms of all input data are publicly available for everyone, e.g., auditors, social activists and experts, to verify the correctness of the process. We implemented FaaS to investigate performance and demonstrate the successful use of FaaS for a publicly available data set with thousands of entries.

5.3LGFeb 3, 2023

VT-GAN: Cooperative Tabular Data Synthesis using Vertical Federated Learning

Zilong Zhao, Han Wu, Aad Van Moorsel et al.

This paper presents the application of Vertical Federated Learning (VFL) to generate synthetic tabular data using Generative Adversarial Networks (GANs). VFL is a collaborative approach to train machine learning models among distinct tabular data holders, such as financial institutions, who possess disjoint features for the same group of customers. In this paper we introduce the VT-GAN framework, Vertical federated Tabular GAN, and demonstrate that VFL can be successfully used to implement GANs for distributed tabular data in privacy-preserving manner, with performance close to centralized GANs that assume shared data. We make design choices with respect to the distribution of GAN generator and discriminator models and introduce a training-with-shuffling technique so that no party can reconstruct training data from the GAN conditional vector. The paper presents (1) an implementation of VT-GAN, (2) a detailed quality evaluation of the VT-GAN-generated synthetic data, (3) an overall scalability examination of VT-GAN framework, (4) a security analysis on VT-GAN's robustness against Membership Inference Attack with different settings of Differential Privacy, for a range of datasets with diverse distribution characteristics. Our results demonstrate that VT-GAN can consistently generate high-fidelity synthetic tabular data of comparable quality to that generated by a centralized GAN algorithm. The difference in machine learning utility can be as low as 2.7%, even under extremely imbalanced data distributions across clients or with different numbers of clients.

1.2CYDec 17, 2021Code

Know Your Customer: Balancing Innovation and Regulation for Financial Inclusion

Karen Elliott, Kovila Coopamootoo, Edward Curran et al.

Financial inclusion depends on providing adjusted services for citizens with disclosed vulnerabilities. At the same time, the financial industry needs to adhere to a strict regulatory framework, which is often in conflict with the desire for inclusive, adaptive, and privacy-preserving services. In this article we study how this tension impacts the deployment of privacy-sensitive technologies aimed at financial inclusion. We conduct a qualitative study with banking experts to understand their perspectives on service development for financial inclusion. We build and demonstrate a prototype solution based on open source decentralized identifiers and verifiable credentials software and report on feedback from the banking experts on this system. The technology is promising thanks to its selective disclosure of vulnerabilities to the full control of the individual. This supports GDPR requirements, but at the same time, there is a clear tension between introducing these technologies and fulfilling other regulatory requirements, particularly with respect to 'Know Your Customer.' We consider the policy implications stemming from these tensions and provide guidelines for the further design of related technologies.

2.3CYJun 10, 2021

Identifying and Supporting Financially Vulnerable Consumers in a Privacy-Preserving Manner: A Use Case Using Decentralised Identifiers and Verifiable Credentials

Tasos Spiliotopoulos, Dave Horsfall, Magdalene Ng et al.

Vulnerable individuals have a limited ability to make reasonable financial decisions and choices and, thus, the level of care that is appropriate to be provided to them by financial institutions may be different from that required for other consumers. Therefore, identifying vulnerability is of central importance for the design and effective provision of financial services and products. However, validating the information that customers share and respecting their privacy are both particularly important in finance and this poses a challenge for identifying and caring for vulnerable populations. This position paper examines the potential of the combination of two emerging technologies, Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), for the identification of vulnerable consumers in finance in an efficient and privacy-preserving manner.

3.8CRMar 18, 2021

Stochastic Simulation Techniques for Inference and Sensitivity Analysis of Bayesian Attack Graphs

Isaac Matthews, Sadegh Soudjani, Aad van Moorsel

A vulnerability scan combined with information about a computer network can be used to create an attack graph, a model of how the elements of a network could be used in an attack to reach specific states or goals in the network. These graphs can be understood probabilistically by turning them into Bayesian attack graphs, making it possible to quantitatively analyse the security of large networks. In the event of an attack, probabilities on the graph change depending on the evidence discovered (e.g., by an intrusion detection system or knowledge of a host's activity). Since such scenarios are difficult to solve through direct computation, we discuss and compare three stochastic simulation techniques for updating the probabilities dynamically based on the evidence and compare their speed and accuracy. From our experiments we conclude that likelihood weighting is most efficient for most uses. We also consider sensitivity analysis of BAGs, to identify the most critical nodes for protection of the network and solve the uncertainty problem in the assignment of priors to nodes. Since sensitivity analysis can easily become computationally expensive, we present and demonstrate an efficient sensitivity analysis approach that exploits a quantitative relation with stochastic inference.

2.9CRSep 25, 2020

Investigation of 3-D Secure's Model for Fraud Detection

Mohammed Aamir Ali, Thomas Groß, Aad van Moorsel

Background. 3-D Secure 2.0 (3DS 2.0) is an identity federation protocol authenticating the payment initiator for credit card transactions on the Web. Aim. We aim to quantify the impact of factors used by 3DS 2.0 in its fraud-detection decision making process. Method. We ran credit card transactions with two Web sites systematically manipulating the nominal IVs \textsf{machine\_data}, \textsf{value}, \textsf{region}, and \textsf{website}. We measured whether the user was \textsf{challenged} with an authentication, whether the transaction was \textsf{declined}, and whether the card was \textsf{blocked} as nominal DVs. Results. While \textsf{website} and \textsf{card} largely did not show a significant impact on any outcome, \textsf{machine\_data}, \textsf{value} and \textsf{region} did. A change in \textsf{machine\_data}, \textsf{region} or \textsf{value} made it 5-7 times as likely to be challenged with password authentication. However, even in a foreign region with another factor being changed, the overall likelihood of being challenged only reached $60\%$. When in the card's home region, a transaction will be rarely declined ($< 5\%$ in control, $40\%$ with one factor changed). However, in a region foreign to the card the system will more likely decline transactions anyway (about $60\%$) and any change in \textsf{machine\_data} or \textsf{value} will lead to a near-certain declined transaction. The \textsf{region} was the only significant predictor for a card being blocked ($\mathsf{OR}=3$). Conclusions. We found that the decisions to challenge the user with a password authentication, to decline a transaction and to block a card are governed by different weightings. 3DS 2.0 is most likely to decline transactions, especially in a foreign region. It is less likely to challenge users with password authentication, even if \textsf{machine\_data} or \textsf{value} are changed.

7.2LGJul 17, 2020

Technologies for Trustworthy Machine Learning: A Survey in a Socio-Technical Context

Ehsan Toreini, Mhairi Aitken, Kovila P. L. Coopamootoo et al.

Concerns about the societal impact of AI-based services and systems has encouraged governments and other organisations around the world to propose AI policy frameworks to address fairness, accountability, transparency and related topics. To achieve the objectives of these frameworks, the data and software engineers who build machine-learning systems require knowledge about a variety of relevant supporting tools and techniques. In this paper we provide an overview of technologies that support building trustworthy machine learning systems, i.e., systems whose properties justify that people place trust in them. We argue that four categories of system properties are instrumental in achieving the policy objectives, namely fairness, explainability, auditability and safety & security (FEAS). We discuss how these properties need to be considered across all stages of the machine learning life cycle, from data collection through run-time model inference. As a consequence, we survey in this paper the main technologies with respect to all four of the FEAS properties, for data-centric as well as model-centric stages of the machine learning system life cycle. We conclude with an identification of open research problems, with a particular focus on the connection between trustworthy machine learning technologies and their implications for individuals and society.

9.6HCJun 27, 2020

Simulating the Effects of Social Presence on Trust, Privacy Concerns & Usage Intentions in Automated Bots for Finance

Magdalene Ng, Kovila P. L. Coopamootoo, Ehsan Toreini et al.

FinBots are chatbots built on automated decision technology, aimed to facilitate accessible banking and to support customers in making financial decisions. Chatbots are increasing in prevalence, sometimes even equipped to mimic human social rules, expectations and norms, decreasing the necessity for human-to-human interaction. As banks and financial advisory platforms move towards creating bots that enhance the current state of consumer trust and adoption rates, we investigated the effects of chatbot vignettes with and without socio-emotional features on intention to use the chatbot for financial support purposes. We conducted a between-subject online experiment with N = 410 participants. Participants in the control group were provided with a vignette describing a secure and reliable chatbot called XRO23, whereas participants in the experimental group were presented with a vignette describing a secure and reliable chatbot that is more human-like and named Emma. We found that Vignette Emma did not increase participants' trust levels nor lowered their privacy concerns even though it increased perception of social presence. However, we found that intention to use the presented chatbot for financial support was positively influenced by perceived humanness and trust in the bot. Participants were also more willing to share financially-sensitive information such as account number, sort code and payments information to XRO23 compared to Emma - revealing a preference for a technical and mechanical FinBot in information sharing. Overall, this research contributes to our understanding of the intention to use chatbots with different features as financial technology, in particular that socio-emotional support may not be favoured when designed independently of financial function.

7.2CRMay 13, 2020

Cyclic Bayesian Attack Graphs: A Systematic Computational Approach

Isaac Matthews, John Mace, Sadegh Soudjani et al.

Attack graphs are commonly used to analyse the security of medium-sized to large networks. Based on a scan of the network and likelihood information of vulnerabilities, attack graphs can be transformed into Bayesian Attack Graphs (BAGs). These BAGs are used to evaluate how security controls affect a network and how changes in topology affect security. A challenge with these automatically generated BAGs is that cycles arise naturally, which make it impossible to use Bayesian network theory to calculate state probabilities. In this paper we provide a systematic approach to analyse and perform computations over cyclic Bayesian attack graphs. %thus providing a generic approach to handle cycles as well as unifying the theory of Bayesian attack graphs. Our approach first formally introduces two commonly used versions of Bayesian attack graphs and compares their expressiveness. We then present an interpretation of Bayesian attack graphs based on combinational logic circuits, which facilitates an intuitively attractive systematic treatment of cycles. We prove properties of the associated logic circuit and present an algorithm that computes state probabilities without altering the attack graphs (e.g., remove an arc to remove a cycle). Moreover, our algorithm deals seamlessly with all cycles without the need to identify their types. A set of experiments using synthetically created networks demonstrates the scalability of the algorithm on computer networks with hundreds of machines, each with multiple vulnerabilities.

11.5CRApr 28, 2020Code

BlockSim: An Extensible Simulation Tool for Blockchain Systems

Maher Alharby, Aad van Moorsel

Both in the design and deployment of blockchain solutions many performance-impacting configuration choices need to be made. We introduce BlockSim, a framework and software tool to build and simulate discrete-event dynamic systems models for blockchain systems. BlockSim is designed to support the analysis of a large variety of blockchains and blockchain deployments as well as a wide set of analysis questions. At the core of BlockSim is a Base Model, which contains the main model constructs common across various blockchain systems organized in three abstraction layers (network, consensus and incentives layer). The Base Model is usable for a wide variety of blockchain systems and can be extended easily to include system or deployment particulars. The BlockSim software tool provides a simulator that implements the Base Model in Python. This paper describes the Base Model, the simulator implementation, and the application of BlockSim to Bitcoin, Ethereum and other consensus algorithms. We validate BlockSim simulation results by comparison with performance results from actual systems and from other studies in the literature. We close the paper by a BlockSim simulation study of the impact of uncle blocks rewards on mining decentralization, for a variety of blockchain configurations.

5.2CRApr 27, 2020

Data-Driven Model-Based Analysis of the Ethereum Verifier's Dilemma

Maher Alharby, Roben Castagna Lunardi, Amjad Aldweesh et al.

In proof-of-work based blockchains such as Ethereum, verification of blocks is an integral part of establishing consensus across nodes. However, in Ethereum, miners do not receive a reward for verifying. This implies that miners face the Verifier's Dilemma: use resources for verification, or use them for the more lucrative mining of new blocks? We provide an extensive analysis of the Verifier's Dilemma, using a data-driven model-based approach that combines closed-form expressions, machine learning techniques and discrete-event simulation. We collect data from over 300,000 smart contracts and experimentally obtain their CPU execution times. Gaussian Mixture Models and Random Forest Regression transform the data into distributions and inputs suitable for the simulator. We show that, indeed, it is often economically rational not to verify. We consider two approaches to mitigate the implications of the Verifier's Dilemma, namely parallelization and active insertion of invalid blocks, both will be shown to be effective.

26.2CYNov 27, 2019

The relationship between trust in AI and trustworthy machine learning technologies

Ehsan Toreini, Mhairi Aitken, Kovila Coopamootoo et al.

To build AI-based systems that users and the public can justifiably trust one needs to understand how machine learning technologies impact trust put in these services. To guide technology developments, this paper provides a systematic approach to relate social science concepts of trust with the technologies used in AI-based services and products. We conceive trust as discussed in the ABI (Ability, Benevolence, Integrity) framework and use a recently proposed mapping of ABI on qualities of technologies. We consider four categories of machine learning technologies, namely these for Fairness, Explainability, Auditability and Safety (FEAS) and discuss if and how these possess the required qualities. Trust can be impacted throughout the life cycle of AI-based systems, and we introduce the concept of Chain of Trust to discuss technological needs for trust in different stages of the life cycle. FEAS has obvious relations with known frameworks and therefore we relate FEAS to a variety of international Principled AI policy and technology frameworks that have emerged in recent years.

22.5CRAug 3, 2017

Betrayal, Distrust, and Rationality: Smart Counter-Collusion Contracts for Verifiable Cloud Computing

Changyu Dong, Yilei Wang, Amjad Aldweesh et al.

Cloud computing has become an irreversible trend. Together comes the pressing need for verifiability, to assure the client the correctness of computation outsourced to the cloud. Existing verifiable computation techniques all have a high overhead, thus if being deployed in the clouds, would render cloud computing more expensive than the on-premises counterpart. To achieve verifiability at a reasonable cost, we leverage game theory and propose a smart contract based solution. In a nutshell, a client lets two clouds compute the same task, and uses smart contracts to stimulate tension, betrayal and distrust between the clouds, so that rational clouds will not collude and cheat. In the absence of collusion, verification of correctness can be done easily by crosschecking the results from the two clouds. We provide a formal analysis of the games induced by the contracts, and prove that the contracts will be effective under certain reasonable assumptions. By resorting to game theory and smart contracts, we are able to avoid heavy cryptographic protocols. The client only needs to pay two clouds to compute in the clear, and a small transaction fee to use the smart contracts. We also conducted a feasibility study that involves implementing the contracts in Solidity and running them on the official Ethereum network.