Andreas Duenser

AI
3 papers
7 citations
Novelty 33%
AI Score 36

3 Papers

AI · Sep 19, 2023
Who to Trust, How and Why: Untangling AI Ethics Principles, Trustworthiness and Trust

Andreas Duenser, David M. Douglas

We present an overview of the literature on trust in AI and AI trustworthiness and argue for the need to distinguish these concepts more clearly and to gather more empirical evidence on what contributes to people's trusting behaviours. We discuss how trust in AI involves not only reliance on the system itself, but also trust in the developers of the AI system. AI ethics principles such as explainability and transparency are often assumed to promote user trust, but empirical evidence of how such features actually affect how users perceive the system's trustworthiness is not as abundant or as clear. AI systems should be recognised as socio-technical systems, where the people involved in designing, developing, deploying, and using the system are as important as the system for determining whether it is trustworthy. Without recognising these nuances, trust in AI and trustworthy AI risk becoming nebulous terms for any desirable feature for AI systems.

80.7 · HC · Mar 12
To Believe or Not To Believe: Comparing Supporting Information Tools to Aid Human Judgments of AI Veracity

Jessica Irons, Patrick Cooper, Necva Bölücü et al.

With increasing awareness of the hallucination risks of generative artificial intelligence (AI), we see a growing shift toward providing information tooling to help users determine the veracity of AI-generated answers for themselves. User responsibility for assessing veracity is particularly critical for certain sectors that rely on on-demand, AI-generated data extraction, such as biomedical research and the legal sector. While prior work offers us a variety of ways in which systems can provide such support, there is a lack of empirical evidence on how this information is actually incorporated into the user's decision-making process. Our user study takes a step toward filling this knowledge gap. In the context of a generative AI data extraction tool, we examine the relationship between the type of supporting information (full source text, passage retrieval, and Large Language Model (LLM) explanations) and user behavior in the veracity assessment process, examined through the lens of efficiency, effectiveness, reliance and trust. We find that passage retrieval offers a reasonable compromise between accuracy and speed, with judgments of veracity comparable to using the full source text. LLM explanations, while also enabling rapid assessments, fostered inappropriate reliance on and trust in the data extraction AI, such that participants were less likely to detect errors. In addition, we analyzed the impacts of the complexity of the information need, finding preliminary evidence that inappropriate reliance is worse for complex answers. We demonstrate how, through rigorous user evaluation, we can better develop systems that allow for effective and responsible human agency in veracity assessment processes.

11.3 · CL · Mar 13
Using a Human-AI Teaming Approach to Create and Curate Scientific Datasets with the SCILIRE System

Necva Bölücü, Jessica Irons, Changhyun Lee et al.

The rapid growth of scientific literature has made manual extraction of structured knowledge increasingly impractical. To address this challenge, we introduce SCILIRE, a system for creating datasets from scientific literature. SCILIRE has been designed around Human-AI teaming principles centred on workflows for verifying and curating data. It facilitates an iterative workflow in which researchers can review and correct AI outputs. Furthermore, this interaction is used as a feedback signal to improve future LLM-based inference. We evaluate our design using a combination of intrinsic benchmarking outcomes together with real-world case studies across multiple domains. The results demonstrate that SCILIRE improves extraction fidelity and facilitates efficient dataset creation.