CYMar 19
Constitutive vs. Corrective: A Causal Taxonomy of Human Runtime Involvement in AI SystemsKevin Baum, Johann Laux
As AI systems increasingly permeate high-stakes decision-making, the terminology regarding human involvement - Human-in-the-Loop (HITL), Human-on-the-Loop (HOTL), and Human Oversight - has become vexingly ambiguous. This ambiguity complicates interdisciplinary collaboration between computer science, law, philosophy, psychology, and sociology and can lead to regulatory uncertainty. We propose a clarification grounded in causal structure, focused on human involvement during the runtime of AI systems. The distinction between HITL and HOTL, we argue, is not primarily spatial but causal: HITL is constitutive (a human contribution is necessary for the decision output), while HOTL is corrective (external to the primary causal chain, capable of preventing or modifying outputs). Within HOTL, we distinguish three temporal modes - synchronous, asynchronous, and anticipatory - situated within a nested model of provider and deployer runtime that clarifies their different capacities for intervention. A second, orthogonal dimension captures cognitive integration: whether human and machine operate as complementary or hybrid intelligence, yielding four structurally distinct configurations. Finally, we distinguish these descriptive categories from the normative requirements they serve: statutory "Human Oversight" is a specific normative mode of HOTL that demands not merely a corrective causal position, but genuine preparedness and capacity for effective intervention. Because the same person may occupy both HITL and HOTL roles simultaneously, we argue that this role duality must be treated as a design problem requiring architectural and epistemic mitigation rather than mere acknowledgment.
CYApr 9
Keeping an Eye on AI: A Framework for Effective Human Oversight of AI SystemsSusanne Gaube, Markus Langer, Tim Miller et al.
The use of Artificial Intelligence (AI) in high-risk, decision-making scenarios presents technical, safety, and normative challenges; problems that may only be ameliorated by human oversight. However, notions of human oversight lack a common foundational understanding: oversight architectures are not well defined, the roles involved remain unclear, and implementation steps are opaque. Hence, researchers and practitioners struggle to determine how to design, implement, and evaluate systems that enable effective human oversight. This paper advances a practical framework for effective human oversight of AI systems, based on a cross-disciplinary perspective that draws on insights from computer science, human-computer interaction, psychology, philosophy, and law. The core contributions are: (1) a foundational framework, with a working definition, architecture and processes for effective human oversight of AI systems; (2) an initial template for documenting oversight architectures and processes, applied to diverse domains; and (3) a synthesis of open research challenges that need to be considered in the emerging field of effective human oversight of AI systems.
GNDec 22, 2023
Improving Task Instructions for Data Annotators: How Clear Rules and Higher Pay Increase Performance in Data Annotation in the AI EconomyJohann Laux, Fabian Stephany, Alice Liefgreen
The global surge in AI applications is transforming industries, leading to displacement and complementation of existing jobs, while also giving rise to new employment opportunities. Data annotation, encompassing the labelling of images or annotating of texts by human workers, crucially influences the quality of a dataset directly influences the quality of AI models trained on it. This paper delves into the economics of data annotation, with a specific focus on the impact of task instruction design (that is, the choice between rules and standards as theorised in law and economics) and monetary incentives on data quality and costs. An experimental study involving 307 data annotators examines six groups with varying task instructions (norms) and monetary incentives. Results reveal that annotators provided with clear rules exhibit higher accuracy rates, outperforming those with vague standards by 14%. Similarly, annotators receiving an additional monetary incentive perform significantly better, with the highest accuracy rate recorded in the group working with both clear rules and incentives (87.5% accuracy). In addition, our results show that rules are perceived as being more helpful by annotators than standards and reduce annotators' difficulty in annotating images. These empirical findings underscore the double benefit of rule-based instructions on both data quality and worker wellbeing. Our research design allows us to reveal that, in our study, rules are more cost-efficient in increasing accuracy than monetary incentives. The paper contributes experimental insights to discussions on the economical, ethical, and legal considerations of AI technologies. Addressing policymakers and practitioners, we emphasise the need for a balanced approach in optimising data annotation processes for efficient and ethical AI development and usage.