AI | Apr 18, 2024

Concept Induction using LLMs: a user experiment for assessment

Adrita Barua, Cara Widmer, Pascal Hitzler

arXiv:2404.11875v28.52 citations

Originality: Incremental advance

AI Analysis

This paper addresses the problem of costly and limited concept discovery in XAI for researchers and practitioners, but the contribution is incremental, since it builds on existing methods and adds a new LLM application.

The study tackled the challenge of generating interpretable concepts for explainable AI in image classification by using GPT-4 to produce high-level concepts from minimal textual data, finding that while human-generated concepts were superior, GPT-4's concepts were more comprehensible than those from the ECII system.

Explainable Artificial Intelligence (XAI) poses a significant challenge in providing transparent and understandable insights into complex AI models. Traditional post-hoc algorithms, while useful, often struggle to deliver interpretable explanations. Concept-based models offer a promising avenue by incorporating explicit representations of concepts to enhance interpretability. However, existing research on automatic concept discovery methods is often limited by lower-level concepts, costly human annotation requirements, and a restricted domain of background knowledge. In this study, we explore the potential of a Large Language Model (LLM), specifically GPT-4, by leveraging its domain knowledge and common-sense capability to generate high-level concepts that are meaningful as explanations for humans, for a specific setting of image classification. We use minimal textual object information available in the data via prompting to facilitate this process. To evaluate the output, we compare the concepts generated by the LLM with two other methods: concepts generated by humans and the ECII heuristic concept induction system. Since there is no established metric to determine the human understandability of concepts, we conducted a human study to assess the effectiveness of the LLM-generated concepts. Our findings indicate that while human-generated explanations remain superior, concepts derived from GPT-4 are more comprehensible to humans compared to those generated by ECII.
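The abstract says concepts are elicited from GPT-4 by prompting with the minimal textual object information available in the data. As a rough illustration only, the sketch below assembles such a prompt from per-image object labels. The function name `build_concept_prompt`, the split into positive and negative images, the parameter `num_concepts`, and the prompt wording are all assumptions made for this example; none of them is taken from the paper, and no LLM call is made.

```python
from typing import Sequence


def build_concept_prompt(
    positive_images: Sequence[Sequence[str]],
    negative_images: Sequence[Sequence[str]],
    num_concepts: int = 5,
) -> str:
    """Assemble a text prompt asking an LLM for high-level concepts.

    Each image is represented only by the object labels annotated in it.
    The prompt wording is hypothetical, not the wording used in the paper.
    """

    def format_images(images: Sequence[Sequence[str]]) -> str:
        # One line per image: de-duplicated, alphabetically sorted labels.
        return "\n".join(
            f"- image {i}: {', '.join(sorted(set(labels)))}"
            for i, labels in enumerate(images, start=1)
        )

    return (
        "The following images belong to the target group:\n"
        f"{format_images(positive_images)}\n\n"
        "The following images do not belong to it:\n"
        f"{format_images(negative_images)}\n\n"
        f"List {num_concepts} high-level concepts that describe the target "
        "group but not the other images."
    )


if __name__ == "__main__":
    # Toy object labels, invented for illustration.
    print(
        build_concept_prompt(
            positive_images=[["sink", "oven", "sink"], ["oven", "kettle"]],
            negative_images=[["bed", "lamp"]],
            num_concepts=3,
        )
    )
```

The returned string would then be sent to whichever LLM client is in use; that step is left out here because the abstract does not describe it.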
