Julian Skirzynski

AI
h-index6
3papers
2citations
Novelty35%
AI Score36

3 Papers

66.4LGJun 3
Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models

Julian Skirzynski, Harry Cheon, Shreyas Kadekodi et al.

Concept bottleneck models predict outcomes from high-level concepts detected in inputs. Although concepts provide a simple way to reap benefits from interpretability, very few datasets include concept labels. This limits researchers' ability to determine which problems are suitable for these models, isolate the factors that drive their performance or lead to failures, or uncover which algorithms perform well. In this paper, we develop synthetic benchmarks for concept-bottleneck models, focusing on their two main use cases: decision support, in which models assist humans in making better decisions, and automation, in which models handle routine tasks without supervision. Our benchmarks can generate labeled datasets while controlling for properties that affect performance, including data modality, concept choice, annotation quality, and completeness. We demonstrate how the benchmarks can be used to evaluate representative classes of concept bottleneck models. Our demonstrations show how the benchmarks can diagnose failure modes and guide follow-up testing.

CLJul 3, 2025
How Much Content Do LLMs Generate That Induces Cognitive Bias in Users?

Abeer Alessa, Akshaya Lakshminarasimhan, Param Somane et al.

Large language models (LLMs) are increasingly integrated into applications ranging from review summarization to medical diagnosis support, where they affect human decisions. Even though LLMs perform well in many tasks, they may also inherit societal or cognitive biases, which can inadvertently transfer to humans. We investigate when and how LLMs expose users to biased content and quantify its severity. Specifically, we assess three LLM families in summarization and news fact-checking tasks, evaluating how much LLMs stay consistent with their context and/or hallucinate. Our findings show that LLMs expose users to content that changes the sentiment of the context in 21.86% of the cases, hallucinates on post-knowledge-cutoff data questions in 57.33% of the cases, and primacy bias in 5.94% of the cases. We evaluate 18 distinct mitigation methods across three LLM families and find that targeted interventions can be effective. Given the prevalent use of LLMs in high-stakes domains, such as healthcare or legal analysis, our results highlight the need for robust technical safeguards and for developing user-centered interventions that address LLM limitations.

AISep 29, 2021
Automatic Discovery and Description of Human Planning Strategies

Julian Skirzynski, Yash Raj Jain, Falk Lieder

Scientific discovery concerns finding patterns in data and creating insightful hypotheses that explain these patterns. Traditionally, this process required human ingenuity, but with the galloping advances in artificial intelligence (AI) it becomes feasible to automate some parts of scientific discovery. In this work we leverage AI for strategy discovery for understanding human planning. In the state-of-the-art methods data about the process of human planning is often used to group similar behaviors together and formulate verbal descriptions of the strategies which might underlie those groups. Here, we automate these two steps. Our method utilizes a new algorithm, called Human-Interpret, that performs imitation learning to describe sequences of planning operations in terms of a procedural formula and then translates that formula to natural language. We test our method on a benchmark data set that researchers have previously scrutinized manually. We find that the descriptions of human planning strategies obtained automatically are about as understandable as human-generated descriptions. They also cover a substantial proportion of of relevant types of human planning strategies that had been discovered manually. Our method saves scientists' time and effort as all the reasoning about human planning is done automatically. This might make it feasible to more rapidly scale up the search for yet undiscovered cognitive strategies to many new decision environments, populations, tasks, and domains. Given these results, we believe that the presented work may accelerate scientific discovery in psychology, and due to its generality, extend to problems from other fields.