Ben Zevenbergen

22.3 · CY · Apr 13

Epistemic Trust as a Mechanism for Ethics Integration: Failure Modes and Design Principles from 70 Moral Imagination Workshops

Benjamin Lange, Geoff Keeling, Kyle Pedersen et al.

Bottom-up responsible innovation initiatives seek to empower technology development teams to engage in ethical reflection, yet such interventions frequently fail to achieve practitioner engagement. Why do some ethics interventions succeed while others are dismissed as irrelevant, adversarial, or disconnected from work? This paper proposes epistemic trust -- the degree to which practitioners regard an intervention, its facilitators, and its content as credible, relevant, and actionable -- as a conceptual model linking intervention design to engagement outcomes. Drawing on philosophical work on testimony and on practice-based qualitative analysis of over 70 moral imagination workshops with engineering teams between 2019 and 2025, we identify five dimensions of epistemic trust salient to ethics interventions (Relevance, Inclusivity, Agency, Authority, and Alignment) and present a typology of 23 failure modes that arise when these dimensions are inadequately addressed. We derive nine design principles for cultivating epistemic trust, grounded in our operationalisation of moral imagination through technomoral scenarios and structured deliberation. Our findings contribute to the literature on collaborative socio-technical integration by specifying conditions of uptake that existing frameworks leave undertheorised. We acknowledge limitations including selection effects from voluntary participation and the absence of formal outcome measures, and position our failure mode typology as practitioner hypotheses warranting further empirical validation.

CL · Jan 20, 2022

LaMDA: Language Models for Dialog Applications

Romal Thoppilan, Daniel De Freitas, Jamie Hall et al.

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.
