HC AI Sep 19, 2024

Exploring the Lands Between: A Method for Finding Differences between AI-Decisions and Human Ratings through Generated Samples

Lukas Mecke, Daniel Buschek, Uwe Gruenefeld, Florian Alt

arXiv:2409.12801v1 2.7 h-index: 49

Originality: Incremental advance

AI Analysis

This work addresses AI-human misalignment in critical decisions such as biometric authentication and provides a tool for testing models beyond clear-cut data. The advance is incremental because it builds on existing generative and evaluation methods.

The paper addresses AI decisions that misalign with human expectations. It proposes generating challenging samples from a generative model's latent space and having both the AI model and human raters evaluate them to identify discrepancies. The authors applied the method to a face recognition model, collecting 11,200 human ratings from 100 participants to analyze where the two align and where they contradict each other.

Many important decisions in our everyday lives, such as authentication via biometric models, are made by Artificial Intelligence (AI) systems. These can be in poor alignment with human expectations, and testing them on clear-cut existing data may not be enough to uncover those cases. We propose a method to find samples in the latent space of a generative model, designed to be challenging for a decision-making model with regard to matching human expectations. By presenting those samples to both the decision-making model and human raters, we can identify areas where its decisions align with human intuition and where they contradict it. We apply this method to a face recognition model and collect a dataset of 11,200 human ratings from 100 participants. We discuss findings from our dataset and how our approach can be used to explore the performance of AI models in different contexts and for different user groups.
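The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `find_disagreements`, the binary "same person" rating format, the majority vote, and the 0.5 decision threshold are all assumptions made for illustration, and the latent-space search that produces the samples is not shown.

```python
from collections import defaultdict


def find_disagreements(model_scores, human_ratings, threshold=0.5):
    """Return sample ids where the model decision differs from the human majority.

    model_scores:  dict of sample_id -> similarity score in [0, 1] (assumed format)
    human_ratings: list of (sample_id, bool) pairs, True meaning "same person"
    threshold:     hypothetical decision threshold of the model
    """
    # Group the individual human ratings by sample.
    votes = defaultdict(list)
    for sample_id, same_person in human_ratings:
        votes[sample_id].append(same_person)

    disagreements = []
    for sample_id, score in model_scores.items():
        ratings = votes.get(sample_id)
        if not ratings:
            continue  # no human ratings were collected for this sample
        human_says_match = sum(ratings) / len(ratings) >= 0.5  # majority vote
        model_says_match = score >= threshold
        if human_says_match != model_says_match:
            disagreements.append(sample_id)
    return disagreements


# Toy data, invented for illustration only.
model_scores = {"s1": 0.9, "s2": 0.4, "s3": 0.7}
human_ratings = [
    ("s1", True), ("s1", True),
    ("s2", True), ("s2", True), ("s2", False),
    ("s3", False), ("s3", False),
]
result = find_disagreements(model_scores, human_ratings)
print(result)
```

On this toy data the model and the raters agree on `s1` and disagree on `s2` and `s3`. Samples flagged this way are the candidates for closer inspection.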
