CYJun 3
Prioritization of Risks from Artificial Intelligence: A Delphi Study of 272 International ExpertsAlexander K. Saeri, Jess Graham, Michael Noetel et al.
Artificial intelligence poses many risks, ranging from familiar present-day harms to unprecedented and potentially catastrophic ones. Effective risk management requires prioritization: we must understand which risks are most severe, who is most vulnerable, and who is most responsible for addressing them. We report results from a three-round Delphi study conducted late 2025 with 272 international AI experts. Experts rated 24 AI risks on harm probability and severity, sector and actor vulnerability, actor responsibility, and overall concern. Experts estimated the five most severe harms in the next 5 years were likely to come from dangerous capabilities, competitive dynamics, weapons & cyberattacks (including CBRNE), power centralization, and false information. In a business-as-usual scenario, experts judged 18 of 24 risks as having a more than 10% probability of catastrophic outcomes (e.g., more than 1 million deaths or more than USD 100B in financial loss) in the next 5 years (2025-2030). In a scenario where pragmatic mitigations are implemented, experts still judged five risks as having a more than 10% probability of catastrophic outcomes: dangerous capabilities, weapons & cyberattacks, environmental harm, inequality & unemployment, and power centralization. All 24 risks were judged as being more than 5% likely to cause catastrophic outcomes. AI users and the general public were judged the most vulnerable to these risks, but experts assigned the highest responsibility for addressing them to general-purpose AI developers and governance actors (including governments, regulators, and standards bodies). Across most risks, experts identified information, finance, and national security as the most vulnerable sectors. These findings can guide AI risk prioritization and clarify expert expectations about who should bear responsibility for mitigation.
CVMay 19Code
A Framework for Evaluating Zero-Shot Image Generation in Concept-based ExplainabilityGiacomo Astolfi, Matteo Bianchi, Riccardo Campi et al.
Concept-based Explainable Artificial Intelligence (XAI) interprets deep learning models using human-understandable visual features (e.g., textures or object parts) by linking internal representations to class predictions, thereby bridging the gap between low-level image data and high-level semantics. A major challenge, however, is the reliance on large sets of labeled images to represent each concept, which limits scalability. In this work, we investigate the use of zero-shot Text-to-Image (T2I) generative models as a source of synthetic concept datasets for concept-based XAI methods. Specifically, we generate concepts using predefined prompts and evaluate their faithfulness to real ones through four complementary analyses: (1) comparing synthetic vs. real concept images via concept representation similarity; (2) evaluating their intra-similarity by comparing pairs of subsets of the same concept with progressively increasing size; (3) evaluating their performance for downstream explanation tasks using relevant class images; (4) evaluating how removing a concept from tested class images affects explanations of generated concepts. While current T2I generative models promise a shortcut to concept-based XAI, our study highlights challenges and raises open questions about the use of synthetic data generated by zero-shot pipelines in model analyses. The resulting dataset is available at https://github.com/DataSciencePolimi/ZeroShot-T2I-Concepts.
CLApr 18Code
Comparing Human and Large Language Model Interpretation of Implicit InformationAntonio De Santis, Tommaso Bonetti, Andrea Tocchetti et al.
The interpretation of implicit meanings is an integral aspect of human communication. However, this framework may not transfer to interactions with Large Language Models (LLMs). To investigate this, we introduce the task of Implicit Information Extraction (IIE) and propose an LLM-based IIE pipeline that builds a structured knowledge graph from a context sentence by extracting relational triplets, validating implicit inferences, and analyzing temporal relations. We evaluate two LLMs against crowdsourced human judgments on two datasets. We find that humans agree with most model triplets yet consistently propose many additions, indicating limited coverage in current LLM-based IIE. Moreover, in our experiments, models appear to be more conservative about implicit inferences than humans in socially rich contexts, whereas humans become more conservative in shorter, fact-oriented contexts. Our code is available at https://github.com/Antonio-Dee/IIE_from_LLM.
AIAug 3, 2024
Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test DataAntonio De Santis, Marco Balduini, Federico De Santis et al.
Aerospace manufacturing companies, such as Thales Alenia Space, design, develop, integrate, verify, and validate products characterized by high complexity and low volume. They carefully document all phases for each product but analyses across products are challenging due to the heterogeneity and unstructured nature of the data in documents. In this paper, we propose a hybrid methodology that leverages Knowledge Graphs (KGs) in conjunction with Large Language Models (LLMs) to extract and validate data contained in these documents. We consider a case study focused on test data related to electronic boards for satellites. To do so, we extend the Semantic Sensor Network ontology. We store the metadata of the reports in a KG, while the actual test results are stored in parquet accessible via a Virtual Knowledge Graph. The validation process is managed using an LLM-based approach. We also conduct a benchmarking study to evaluate the performance of state-of-the-art LLMs in executing this task. Finally, we analyze the costs and benefits of automating preexisting processes of manual data extraction and validation for subsequent cross-report analyses.
IRMar 20
LLM-Enhanced Semantic Data Integration of Electronic Component Qualifications in the Aerospace DomainAntonio De Santis, Marco Balduini, Matteo Belcao et al.
Large manufacturing companies face challenges in information retrieval due to data silos maintained by different departments, leading to inconsistencies and misalignment across databases. This paper presents an experience in integrating and retrieving qualification data for electronic components used in satellite board design. Due to data silos, designers cannot immediately determine the qualification status of individual components. However, this process is critical during the planning phase, when assembly drawings are issued before production, to optimize new qualifications and avoid redundant efforts. To address this, we propose a pipeline that uses Virtual Knowledge Graphs for a unified view over heterogeneous data sources and LLMs to enhance retrieval and reduce manual effort in data cleansing. The retrieval of qualifications is then performed through an Ontology-based Data Access approach for structured queries and a vector search mechanism for retrieving qualifications based on similar textual properties. We perform a comparative cost-benefit analysis, demonstrating that the proposed pipeline also outperforms approaches relying solely on LLMs, such as Retrieval-Augmented Generation (RAG), in terms of long-term efficiency.
CVNov 8, 2024Code
Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image ClassificationAntonio De Santis, Riccardo Campi, Matteo Bianchi et al.
Convolutional Neural Networks (CNNs) have seen significant performance improvements in recent years. However, due to their size and complexity, they function as black-boxes, leading to transparency concerns. State-of-the-art saliency methods generate local explanations that highlight the area in the input image where a class is identified but cannot explain how a concept of interest contributes to the prediction, which is essential for bias mitigation. On the other hand, concept-based methods, such as TCAV (Testing with Concept Activation Vectors), provide insights into how sensitive is the network to a concept, but cannot compute its attribution in a specific prediction nor show its location within the input image. This paper introduces a novel post-hoc explainability framework, Visual-TCAV, which aims to bridge the gap between these methods by providing both local and global explanations for CNN-based image classification. Visual-TCAV uses Concept Activation Vectors (CAVs) to generate saliency maps that show where concepts are recognized by the network. Moreover, it can estimate the attribution of these concepts to the output of any class using a generalization of Integrated Gradients. This framework is evaluated on popular CNN architectures, with its validity further confirmed via experiments where ground truth for explanations is known, and a comparison with TCAV. Our code is available at https://github.com/DataSciencePolimi/Visual-TCAV.
LGMar 7Code
Learning Concept Bottleneck Models from Mechanistic ExplanationsAntonio De Santis, Schrasing Tong, Marco Brambilla et al.
Concept Bottleneck Models (CBMs) aim for ante-hoc interpretability by learning a bottleneck layer that predicts interpretable concepts before the decision. State-of-the-art approaches typically select which concepts to learn via human specification, open knowledge graphs, prompting an LLM, or using general CLIP concepts. However, concepts defined a-priori may not have sufficient predictive power for the task or even be learnable from the available data. As a result, these CBMs often significantly trail their black-box counterpart when controlling for information leakage. To address this, we introduce a novel CBM pipeline named Mechanistic CBM (M-CBM), which builds the bottleneck directly from a black-box model's own learned concepts. These concepts are extracted via Sparse Autoencoders (SAEs) and subsequently named and annotated on a selected subset of images using a Multimodal LLM. For fair comparison and leakage control, we also introduce the Number of Contributing Concepts (NCC), a decision-level sparsity metric that extends the recently proposed NEC metric. Across diverse datasets, we show that M-CBMs consistently surpass prior CBMs at matched sparsity, while improving concept predictions and providing concise explanations. Our code is available at https://github.com/Antonio-Dee/M-CBM.
LGMay 6, 2024
Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image ClassificationMatteo Bianchi, Antonio De Santis, Andrea Tocchetti et al.
Transparency and explainability in image classification are essential for establishing trust in machine learning models and detecting biases and errors. State-of-the-art explainability methods generate saliency maps to show where a specific class is identified, without providing a detailed explanation of the model's decision process. Striving to address such a need, we introduce a post-hoc method that explains the entire feature extraction process of a Convolutional Neural Network. These explanations include a layer-wise representation of the features the model extracts from the input. Such features are represented as saliency maps generated by clustering and merging similar feature maps, to which we associate a weight derived by generalizing Grad-CAM for the proposed methodology. To further enhance these explanations, we include a set of textual labels collected through a gamified crowdsourcing activity and processed using NLP techniques and Sentence-BERT. Finally, we show an approach to generate global explanations by aggregating labels across multiple images.