CY Apr 20
Informing AI Policy Assessment using Large-Scale Simulation of Interventions
Julia Barnett, Kimon Kieslich, Natali Helberger et al.
As the rapid proliferation of AI systems and harms spurs efforts in AI governance around the world, prioritizing among competing policy options has become increasingly challenging for policymakers and researchers. We introduce a methodology for identifying viable policy options to mitigate specified AI harms, helping policymakers and researchers target areas that warrant greater time and resource investment. This method combines participatory evaluation of policies, expert assessment of implementation costs, and an LLM-based assessment of perceived harm mitigation under each policy option. We leverage a genetic algorithm-based simulation study to explore a vast solution space of potential policy combinations, and examine how outcomes change under different weightings of cost, participatory input, and harm mitigation. We find that this method enables exploration of different balances between participatory and expert components, allowing policymakers and researchers to assess how much weight to assign to each. We argue that the diversity of viable policy combinations found by the genetic algorithm could be a useful starting point for deliberation. This method operationalizes existing work on participatory AI by integrating it directly into practical policy development pipelines.
CL Nov 4, 2024 (Code)
Towards Leveraging News Media to Support Impact Assessment of AI Technologies
Mowafak Allaham, Kimon Kieslich, Nicholas Diakopoulos
Expert-driven frameworks for impact assessments (IAs) may inadvertently overlook the effects of AI technologies on the public's social behavior, policy, and the cultural and geographical contexts shaping the perception of AI and the impacts around its use. This research explores the potential of fine-tuning LLMs on negative impacts of AI reported in a diverse sample of articles from 266 news domains spanning 30 countries around the world to incorporate more diversity into IAs. Our findings highlight (1) the potential of fine-tuned open-source LLMs in supporting IA of AI technologies by generating high-quality negative impacts across four qualitative dimensions: coherence, structure, relevance, and plausibility, and (2) the efficacy of a small open-source LLM (Mistral-7B) fine-tuned on impacts from news media in capturing a wider range of categories of impacts that GPT-4 had gaps in covering.
CL May 15, 2024
Simulating Policy Impacts: Developing a Generative Scenario Writing Method to Evaluate the Perceived Effects of Regulation
Julia Barnett, Kimon Kieslich, Nicholas Diakopoulos
The rapid advancement of AI technologies yields numerous future impacts on individuals and society. Policymakers are tasked with reacting quickly and establishing policies that mitigate those impacts. However, anticipating the effectiveness of policies is a difficult task, as some impacts might only be observable in the future and the respective policies might not be applicable to the future development of AI. In this work we develop a method for using large language models (LLMs) to evaluate the efficacy of a given piece of policy at mitigating specified negative impacts. We do so by using GPT-4 to generate scenarios both pre- and post-introduction of policy and translating these vivid stories into metrics based on human perceptions of impacts. We leverage an already established taxonomy of impacts of generative AI in the media environment to generate a set of scenario pairs both mitigated and non-mitigated by the transparency policy in Article 50 of the EU AI Act. We then run a user study (n=234) to evaluate these scenarios across four risk-assessment dimensions: severity, plausibility, magnitude, and specificity to vulnerable populations. We find that this transparency legislation is perceived to be effective at mitigating harms in areas such as labor and well-being, but largely ineffective in areas such as social cohesion and security. Through this case study we demonstrate the efficacy of our method as a tool to iterate on the effectiveness of policy for mitigating various negative impacts. We expect this method to be useful to researchers or other stakeholders who want to brainstorm the potential utility of different pieces of policy or other mitigation strategies.
HC Jun 5, 2025
Scenarios in Computing Research: A Systematic Review of the Use of Scenario Methods for Exploring the Future of Computing Technologies in Society
Julia Barnett, Kimon Kieslich, Jasmine Sinchai et al.
Scenario building is an established method to anticipate the future of emerging technologies. Its primary goal is to use narratives to map future trajectories of technology development and sociotechnical adoption. Following this process, risks and benefits can be identified early on, and strategies can be developed that strive for desirable futures. In recent years, computer science has adopted this method and applied it to various technologies, including Artificial Intelligence (AI). Because computing technologies play such an important role in shaping modern societies, it is worth exploring how scenarios are being used as an anticipatory tool in the field -- and what possible traditional uses of scenarios are not yet covered but have the potential to enrich the field. We address this gap by conducting a systematic literature review on the use of scenario building methods in computer science over the last decade (n = 59). We guide the review along two main questions. First, we aim to uncover how scenarios are used in computing literature, focusing especially on the rationale for why scenarios are used. Second, in following the potential of scenario building to enhance inclusivity in research, we dive deeper into the participatory element of the existing scenario building literature in computer science.
CY Jan 24, 2025
Envisioning Stakeholder-Action Pairs to Mitigate Negative Impacts of AI: A Participatory Approach to Inform Policy Making
Julia Barnett, Kimon Kieslich, Natali Helberger et al.
The potential for negative impacts of AI has rapidly become more pervasive around the world, and this has intensified the need for responsible AI governance. While many regulatory bodies endorse risk-based approaches and a multitude of risk mitigation practices are proposed by companies and academic scholars, these approaches are commonly expert-centered and thus lack the inclusion of a significant group of stakeholders. Ensuring that AI policies align with democratic expectations requires methods that prioritize the voices and needs of those impacted. In this work we develop a participatory and forward-looking approach to inform policy-makers and academics that places the needs of lay stakeholders at the forefront and enriches the development of risk mitigation strategies. Our approach (1) maps potential mitigation and prevention strategies of negative AI impacts that assign responsibility to various stakeholders, (2) explores the importance and prioritization thereof in the eyes of laypeople, and (3) presents these insights in policy fact sheets, i.e., a digestible format for informing policy processes. We emphasize that this approach is not targeted towards replacing policy-makers; rather our aim is to present an informative method that enriches mitigation strategies and enables a more participatory approach to policy development.
CY Jul 19, 2021
Using automated decision-making (ADM) to allocate Covid-19 vaccinations? Exploring the roles of trust and social group preference on the legitimacy of ADM vs. human decision-making
Marco Lünich, Kimon Kieslich
In combating the ongoing global health threat of the Covid-19 pandemic, decision-makers have to take actions based on a multitude of relevant health data with severe potential consequences for the affected patients. Because of their presumed advantages in handling and analyzing vast amounts of data, computer systems of automated decision-making (ADM) are implemented and substitute for humans in decision-making processes. In this study, we focus on a specific application of ADM in contrast to human decision-making (HDM), namely the allocation of Covid-19 vaccines to the public. In particular, we elaborate on the role of trust and social group preference in the legitimacy of vaccine allocation. We conducted a survey with a 2x2 randomized factorial design among n=1602 German respondents, in which we utilized distinct decision-making agents (HDM vs. ADM) and prioritization of a specific social group (teachers vs. prisoners) as design factors. Our findings show that general trust in ADM systems and preference for vaccination of a specific social group influence the legitimacy of vaccine allocation. However, contrary to our expectations, trust in the agent making the decision did not moderate the link between social group preference and legitimacy. Moreover, the effect was also not moderated by the type of decision-maker (human vs. algorithm). We conclude that trustworthy ADM systems do not necessarily lead to the legitimacy of ADM systems.
CY Jun 1, 2021
AI-Ethics by Design. Evaluating Public Perception on the Importance of Ethical Design Principles of AI
Kimon Kieslich, Birte Keller, Christopher Starke
Despite the immense societal importance of ethically designing artificial intelligence (AI), little research on the public perceptions of ethical AI principles exists. This becomes even more striking when considering that ethical AI development aims to be human-centric and to benefit society as a whole. In this study, we investigate how ethical principles (explainability, fairness, security, accountability, accuracy, privacy, machine autonomy) are weighted in comparison to each other. This is especially important, since simultaneously considering ethical principles is not only costly, but sometimes even impossible, as developers must make specific trade-off decisions. In this paper, we offer initial answers on the relative importance of ethical principles given a specific use case: the use of AI in tax fraud detection. The results of a large conjoint survey (n=1099) suggest that, by and large, German respondents found the ethical principles equally important. However, subsequent cluster analysis shows that different preference models for ethically designed systems exist among the German population. These clusters substantially differ not only in the preferred attributes, but also in the importance level of the attributes themselves. We further describe how these groups are constituted in terms of sociodemographics as well as opinions on AI. Societal implications as well as design challenges are discussed.
CY Jun 12, 2020
The Threats of Artificial Intelligence Scale (TAI). Development, Measurement and Test Over Three Application Domains
Kimon Kieslich, Marco Lünich, Frank Marcinkowski
In recent years Artificial Intelligence (AI) has gained much popularity, with the scientific community as well as with the public. AI is often ascribed many positive impacts for different social domains such as medicine and the economy. On the other hand, there is also growing concern about its precarious impact on society and individuals. Several opinion polls frequently query the public fear of autonomous robots and artificial intelligence (FARAI), a phenomenon also coming into scholarly focus. As potential threat perceptions arguably vary with regard to the reach and consequences of AI functionalities and the domain of application, research still lacks the necessary precision of a respective measurement that allows for widespread research applicability. We propose a fine-grained scale to measure threat perceptions of AI that accounts for four functional classes of AI systems and is applicable to various domains of AI applications. Using a standardized questionnaire in a survey study (N=891), we evaluate the scale over three distinct AI domains (loan origination, job recruitment and medical treatment). The data support the dimensional structure of the proposed Threats of AI (TAI) scale as well as the internal consistency and factorial validity of the indicators. Implications of the results and the empirical application of the scale are discussed in detail. Recommendations for further empirical use of the TAI scale are provided.