Mark Staples

h-index: 29

6 papers

888 citations

Novelty: 20%

AI Score: 21

Ranked #181,178 of 194,257 authors (top 93%); #2,313 in SE (top 76%)

6 Papers

24.8 · CY · Jul 8, 2023

Right to be Forgotten in the Era of Large Language Models: Implications, Challenges, and Solutions

Dawen Zhang, Pamela Finckenberg-Broman, Thong Hoang et al.

The Right to be Forgotten (RTBF) was first established as the result of the ruling of Google Spain SL, Google Inc. v AEPD, Mario Costeja González, and was later included as the Right to Erasure under the General Data Protection Regulation (GDPR) of the European Union to allow individuals the right to request personal data be deleted by organizations. Specifically for search engines, individuals can send requests to organizations to exclude their information from the query results. It was a significant emergent right as the result of the evolution of technology. With the recent development of Large Language Models (LLMs) and their use in chatbots, LLM-enabled software systems have become popular. But they are not excluded from the RTBF. Compared with the indexing approach used by search engines, LLMs store and process information in a completely different way. This poses new challenges for compliance with the RTBF. In this paper, we explore these challenges and provide our insights on how to implement technical solutions for the RTBF, including the use of differential privacy, machine unlearning, model editing, and guardrails. With the rapid advancement of AI and the increasing need to regulate this powerful technology, learning from the case of RTBF can provide valuable lessons for technical practitioners, legal experts, organizations, and authorities.

15.6 · SE · Feb 7, 2023

To Be Forgotten or To Be Fair: Unveiling Fairness Implications of Machine Unlearning Methods

Dawen Zhang, Shidong Pan, Thong Hoang et al.

The right to be forgotten (RTBF) is motivated by the desire of people not to be perpetually disadvantaged by their past deeds. For this, data deletion needs to be deep and permanent, and the data should also be removed from machine learning models. Researchers have proposed machine unlearning algorithms which aim to erase specific data from trained models more efficiently. However, these methods modify how data is fed into the model and how training is done, which may subsequently compromise AI ethics from the fairness perspective. To help software engineers make responsible decisions when adopting these unlearning methods, we present the first study on machine unlearning methods to reveal their fairness implications. We designed and conducted experiments on two typical machine unlearning methods (SISA and AmnesiacML) along with a retraining method (ORTR) as a baseline using three fairness datasets under three different deletion strategies. Experimental results show that under non-uniform data deletion, SISA leads to better fairness compared with ORTR and AmnesiacML, while initial training and uniform data deletion do not necessarily affect the fairness of all three methods. These findings have exposed an important research problem in software engineering, and can help practitioners better understand the potential trade-offs on fairness when considering solutions for RTBF.

8.4 · SE · Nov 30, 2023

Privacy and Copyright Protection in Generative AI: A Lifecycle Perspective

Dawen Zhang, Boming Xia, Yue Liu et al.

The advent of Generative AI has marked a significant milestone in artificial intelligence, demonstrating remarkable capabilities in generating realistic images, texts, and data patterns. However, these advancements come with heightened concerns over data privacy and copyright infringement, primarily due to the reliance on vast datasets for model training. Traditional approaches like differential privacy, machine unlearning, and data poisoning only offer fragmented solutions to these complex issues. Our paper delves into the multifaceted challenges of privacy and copyright protection within the data lifecycle. We advocate for integrated approaches that combine technical innovation with ethical foresight, holistically addressing these concerns by investigating and devising solutions that are informed by the lifecycle perspective. This work aims to catalyze a broader discussion and inspire concerted efforts towards data privacy and copyright integrity in Generative AI.

1.2 · CY · Jul 19, 2023

Test-takers have a say: understanding the implications of the use of AI in language tests

Dawen Zhang, Thong Hoang, Shidong Pan et al.

Language tests measure a person's ability to use a language in terms of listening, speaking, reading, or writing. Such tests play an integral role in academic, professional, and immigration domains, with entities such as educational institutions, professional accreditation bodies, and governments using them to assess candidate language proficiency. Recent advances in Artificial Intelligence (AI) and the discipline of Natural Language Processing have prompted language test providers to explore AI's potential applicability within language testing, leading to transformative activity patterns surrounding language instruction and learning. However, with concerns over AI's trustworthiness, it is imperative to understand the implications of integrating AI into language testing. This knowledge will enable stakeholders to make well-informed decisions, thus safeguarding community well-being and testing integrity. To understand the concerns and effects of AI usage in language tests, we conducted interviews and surveys with English test-takers. To the best of our knowledge, this is the first empirical study aimed at identifying the implications of AI adoption in language tests from a test-taker perspective. Our study reveals test-taker perceptions and behavioral patterns. Specifically, we identify that AI integration may enhance perceptions of fairness, consistency, and availability. Conversely, it might incite mistrust regarding reliability and interactivity aspects, subsequently influencing the behaviors and well-being of test-takers. These insights provide a better understanding of potential societal implications and assist stakeholders in making informed decisions concerning AI usage in language testing.

14.6 · SE · May 12, 2021

A Systematic Literature Review on Blockchain Governance

Yue Liu, Qinghua Lu, Liming Zhu et al.

Blockchain has been increasingly used as a software component to enable decentralisation in software architecture for a variety of applications. Blockchain governance has received considerable attention to ensure the safe and appropriate use and evolution of blockchain, especially after the Ethereum DAO attack in 2016. However, there are no systematic efforts to analyse existing governance solutions. To understand the state-of-the-art of blockchain governance, we conducted a systematic literature review with 37 primary studies. The extracted data from primary studies are synthesised to answer identified research questions. The study results reveal several major findings: 1) governance can improve the adaptability and upgradability of blockchain, whilst the current studies neglect broader ethical responsibilities as the objectives of blockchain governance; 2) governance accompanies the development process of a blockchain platform, while an ecosystem-level governance process is missing; and 3) the responsibilities and capabilities of blockchain stakeholders are briefly discussed, whilst the decision rights, accountability, and incentives of blockchain stakeholders are still understudied. We provide actionable guidelines for academia and practitioners to use throughout the lifecycle of blockchain, and identify future trends to support researchers in this area.

20.6 · SE · Apr 12, 2017

Blockchains for Business Process Management - Challenges and Opportunities

Jan Mendling, Ingo Weber, Wil van der Aalst et al.

Blockchain technology promises a sizable potential for executing inter-organizational business processes without requiring a central party serving as a single point of trust (and failure). This paper analyzes its impact on business process management (BPM). We structure the discussion using two BPM frameworks, namely the six BPM core capabilities and the BPM lifecycle. This paper provides research directions for investigating the application of blockchain technology to BPM.