Andrea Giovanni Nuzzolese

h-index38

19papers

316citations

Novelty33%

AI Score49

Ranked #45,618 of 201,326 authors (top 23%)#2,457 in AI (top 17%)

19 Papers

AIJun 6, 2022

Automatically Drafting Ontologies from Competency Questions with FrODO

Aldo Gangemi, Anna Sofia Lippolis, Giorgia Lodi et al.

We present the Frame-based ontology Design Outlet (FrODO), a novel method and tool for drafting ontologies from competency questions automatically. Competency questions are expressed as natural language and are a common solution for representing requirements in a number of agile ontology engineering methodologies, such as the eXtreme Design (XD) or SAMOD. FrODO builds on top of FRED. In fact, it leverages the frame semantics for drawing domain-relevant boundaries around the RDF produced by FRED from a competency question, thus drafting domain ontologies. We carried out a user-based study for assessing FrODO in supporting engineers for ontology design tasks. The study shows that FrODO is effective in this and the resulting ontology drafts are qualitative.

DBFeb 23

The Climate Change Knowledge Graph: Supporting Climate Services

Miguel Ceriani, Fiorela Ciroku, Alessandro Russo et al.

Climate change impacts a broad spectrum of human resources and activities, necessitating the use of climate models to project long-term effects and inform mitigation and adaptation strategies. These models generate multiple datasets by running simulations across various scenarios and configurations, thereby covering a range of potential future outcomes. Currently, researchers rely on traditional search interfaces and APIs to retrieve such datasets, often piecing together information from metadata and community vocabularies. The Climate Change Knowledge Graph is designed to address these challenges by integrating diverse data sources related to climate simulations into a coherent and interoperable knowledge graph. This innovative resource allows for executing complex queries involving climate models, simulations, variables, spatio-temporal domains, and granularities. Developed with input from domain experts, the knowledge graph and its underlying ontology are published with open access license and provide a comprehensive framework that enhances the exploration of climate data, facilitating more informed decision-making in addressing climate change issues.

22.7AIMay 8

Tacit Knowledge Extraction via Logic Augmented Generation and Active Inference

Lorenzo Lamazzi, Aldo Gangemi, Alessio Giberti et al.

Tacit knowledge plays a central role in human expertise, yet it remains difficult to capture, formalize, and reuse in machine-interpretable form. This challenge is especially relevant in procedural domains, where successful execution depends not only on explicit instructions, but also on implicit assumptions, contextual constraints, embodied skills, and experience-based judgments rarely documented. As a result, current knowledge engineering pipelines struggle to transform tacit and process-centric knowledge into formally specified, machine-interpretable representations that can be queried, validated, reasoned over, and reused. In this paper, we introduce a neuro-symbolic framework that combines Logic-Augmented Generation and an Active-Inference-inspired approach for ontology-grounded Knowledge Graph construction. We evaluate the approach in a knowledge transfer case study in manufacturing, using assembly-like repair procedures from instructional videos as a reproducible proxy domain. Results show that the proposed solution improves completeness and semantic quality, advancing neuro-symbolic knowledge engineering for industrial domains.

AIMar 7, 2025

Ontology Generation using Large Language Models

Anna Sofia Lippolis, Mohammad Javad Saeedizade, Robin Keskisärkkä et al.

The ontology engineering process is complex, time-consuming, and error-prone, even for experienced ontology engineers. In this work, we investigate the potential of Large Language Models (LLMs) to provide effective OWL ontology drafts directly from ontological requirements described using user stories and competency questions. Our main contribution is the presentation and evaluation of two new prompting techniques for automated ontology development: Memoryless CQbyCQ and Ontogenia. We also emphasize the importance of three structural criteria for ontology assessment, alongside expert qualitative evaluation, highlighting the need for a multi-dimensional evaluation in order to capture the quality and usability of the generated ontologies. Our experiments, conducted on a benchmark dataset of ten ontologies with 100 distinct CQs and 29 different user stories, compare the performance of three LLMs using the two prompting techniques. The results demonstrate improvements over the current state-of-the-art in LLM-supported ontology engineering. More specifically, the model OpenAI o1-preview with Ontogenia produces ontologies of sufficient quality to meet the requirements of ontology engineers, significantly outperforming novice ontology engineers in modelling ability. However, we still note some common mistakes and variability of result quality, which is important to take into account when using LLMs for ontology authoring support. We discuss these limitations and propose directions for future research.

AIApr 24, 2025

Assessing the Capability of Large Language Models for Domain-Specific Ontology Generation

Anna Sofia Lippolis, Mohammad Javad Saeedizade, Robin Keskisarkka et al.

Large Language Models (LLMs) have shown significant potential for ontology engineering. However, it is still unclear to what extent they are applicable to the task of domain-specific ontology generation. In this study, we explore the application of LLMs for automated ontology generation and evaluate their performance across different domains. Specifically, we investigate the generalizability of two state-of-the-art LLMs, DeepSeek and o1-preview, both equipped with reasoning capabilities, by generating ontologies from a set of competency questions (CQs) and related user stories. Our experimental setup comprises six distinct domains carried out in existing ontology engineering projects and a total of 95 curated CQs designed to test the models' reasoning for ontology engineering. Our findings show that with both LLMs, the performance of the experiments is remarkably consistent across all domains, indicating that these methods are capable of generalizing ontology generation tasks irrespective of the domain. These results highlight the potential of LLM-based approaches in achieving scalable and domain-agnostic ontology construction and lay the groundwork for further research into enhancing automated reasoning and knowledge representation techniques.

AINov 21, 2024

Logic Augmented Generation

Aldo Gangemi, Andrea Giovanni Nuzzolese

Semantic Knowledge Graphs (SKG) face challenges with scalability, flexibility, contextual understanding, and handling unstructured or ambiguous information. However, they offer formal and structured knowledge enabling highly interpretable and reliable results by means of reasoning and querying. Large Language Models (LLMs) overcome those limitations making them suitable in open-ended tasks and unstructured environments. Nevertheless, LLMs are neither interpretable nor reliable. To solve the dichotomy between LLMs and SKGs we envision Logic Augmented Generation (LAG) that combines the benefits of the two worlds. LAG uses LLMs as Reactive Continuous Knowledge Graphs that can generate potentially infinite relations and tacit knowledge on-demand. SKGs are key for injecting a discrete heuristic dimension with clear logical and factual boundaries. We exemplify LAG in two tasks of collective intelligence, i.e., medical diagnostics and climate projections. Understanding the properties and limitations of LAG, which are still mostly unknown, is of utmost importance for enabling a variety of tasks involving tacit knowledge in order to provide interpretable and effective results.

AIJul 19, 2025

Large Language Models Assisting Ontology Evaluation

Anna Sofia Lippolis, Mohammad Javad Saeedizade, Robin Keskisärkkä et al.

Ontology evaluation through functional requirements, such as testing via competency question (CQ) verification, is a well-established yet costly, labour-intensive, and error-prone endeavour, even for ontology engineering experts. In this work, we introduce OE-Assist, a novel framework designed to assist ontology evaluation through automated and semi-automated CQ verification. By presenting and leveraging a dataset of 1,393 CQs paired with corresponding ontologies and ontology stories, our contributions present, to our knowledge, the first systematic investigation into large language model (LLM)-assisted ontology evaluation, and include: (i) evaluating the effectiveness of a LLM-based approach for automatically performing CQ verification against a manually created gold standard, and (ii) developing and assessing an LLM-powered framework to assist CQ verification with Protégé, by providing suggestions. We found that automated LLM-based evaluation with o1-preview and o3-mini perform at a similar level to the average user's performance.

CLMay 30, 2025

Bench4KE: Benchmarking Automated Competency Question Generation

Anna Sofia Lippolis, Minh Davide Ragagni, Paolo Ciancarini et al.

The availability of Large Language Models (LLMs) presents a unique opportunity to reinvigorate research on Knowledge Engineering (KE) automation, a trend already evident in recent efforts developing LLM-based methods and tools for the automatic generation of Competency Questions (CQs). However, the evaluation of these tools lacks standardisation. This undermines the methodological rigour and hinders the replication and comparison of results. To address this gap, we introduce Bench4KE, an extensible API-based benchmarking system for KE automation. Its first release focuses on evaluating tools that generate CQs automatically. CQs are natural language questions used by ontology engineers to define the functional requirements of an ontology. Bench4KE provides a curated gold standard consisting of CQ datasets from four real-world ontology projects. It uses a suite of similarity metrics to assess the quality of the CQs generated. We present a comparative analysis of four recent CQ generation systems, which are based on LLMs, establishing a baseline for future research. Bench4KE is also designed to accommodate additional KE automation tasks, such as SPARQL query generation, ontology testing and drafting. Code and datasets are publicly available under the Apache 2.0 license.

DBMay 27, 2025

Streamlining Knowledge Graph Creation with PyRML

Andrea Giovanni Nuzzolese

Knowledge Graphs (KGs) are increasingly adopted as a foundational technology for integrating heterogeneous data in domains such as climate science, cultural heritage, and the life sciences. Declarative mapping languages like R2RML and RML have played a central role in enabling scalable and reusable KG construction, offering a transparent means of transforming structured and semi-structured data into RDF. In this paper, we present PyRML, a lightweight, Python-native library for building Knowledge Graphs through declarative mappings. PyRML supports core RML constructs and provides a programmable interface for authoring, executing, and testing mappings directly within Python environments. It integrates with popular data and semantic web libraries (e.g., Pandas and RDFlib), enabling transparent and modular workflows. By lowering the barrier to entry for KG creation and fostering reproducible, ontology-aligned data integration, PyRML bridges the gap between declarative semantics and practical KG engineering.

AIApr 15, 2025

Enhancing multimodal analogical reasoning with Logic Augmented Generation

Anna Sofia Lippolis, Andrea Giovanni Nuzzolese, Aldo Gangemi

Recent advances in Large Language Models have demonstrated their capabilities across a variety of tasks. However, automatically extracting implicit knowledge from natural language remains a significant challenge, as machines lack active experience with the physical world. Given this scenario, semantic knowledge graphs can serve as conceptual spaces that guide the automated text generation reasoning process to achieve more efficient and explainable results. In this paper, we apply a logic-augmented generation (LAG) framework that leverages the explicit representation of a text through a semantic knowledge graph and applies it in combination with prompt heuristics to elicit implicit analogical connections. This method generates extended knowledge graph triples representing implicit meaning, enabling systems to reason on unlabeled multimodal data regardless of the domain. We validate our work through three metaphor detection and understanding tasks across four datasets, as they require deep analogical reasoning capabilities. The results show that this integrated approach surpasses current baselines, performs better than humans in understanding visual metaphors, and enables more explainable reasoning processes, though still has inherent limitations in metaphor understanding, especially for domain-specific metaphors. Furthermore, we propose a thorough error analysis, discussing issues with metaphorical annotations and current evaluation methods.

AINov 21, 2025

The Belief-Desire-Intention Ontology for modelling mental reality and agency

Sara Zuppiroli, Carmelo Fabio Longo, Anna Sofia Lippolis et al.

The Belief-Desire-Intention (BDI) model is a cornerstone for representing rational agency in artificial intelligence and cognitive sciences. Yet, its integration into structured, semantically interoperable knowledge representations remains limited. This paper presents a formal BDI Ontology, conceived as a modular Ontology Design Pattern (ODP) that captures the cognitive architecture of agents through beliefs, desires, intentions, and their dynamic interrelations. The ontology ensures semantic precision and reusability by aligning with foundational ontologies and best practices in modular design. Two complementary lines of experimentation demonstrate its applicability: (i) coupling the ontology with Large Language Models (LLMs) via Logic Augmented Generation (LAG) to assess the contribution of ontological grounding to inferential coherence and consistency; and (ii) integrating the ontology within the Semas reasoning platform, which implements the Triples-to-Beliefs-to-Triples (T2B2T) paradigm, enabling a bidirectional flow between RDF triples and agent mental states. Together, these experiments illustrate how the BDI Ontology acts as both a conceptual and operational bridge between declarative and procedural intelligence, paving the way for cognitively grounded, explainable, and semantically interoperable multi-agent and neuro-symbolic systems operating within the Web of Data.

CLAug 11, 2025

The Medical Metaphors Corpus (MCC)

Anna Sofia Lippolis, Andrea Giovanni Nuzzolese, Aldo Gangemi

Metaphor is a fundamental cognitive mechanism that shapes scientific understanding, enabling the communication of complex concepts while potentially constraining paradigmatic thinking. Despite the prevalence of figurative language in scientific discourse, existing metaphor detection resources primarily focus on general-domain text, leaving a critical gap for domain-specific applications. In this paper, we present the Medical Metaphors Corpus (MCC), a comprehensive dataset of 792 annotated scientific conceptual metaphors spanning medical and biological domains. MCC aggregates metaphorical expressions from diverse sources including peer-reviewed literature, news media, social media discourse, and crowdsourced contributions, providing both binary and graded metaphoricity judgments validated through human annotation. Each instance includes source-target conceptual mappings and perceived metaphoricity scores on a 0-7 scale, establishing the first annotated resource for computational scientific metaphor research. Our evaluation demonstrates that state-of-the-art language models achieve modest performance on scientific metaphor detection, revealing substantial room for improvement in domain-specific figurative language understanding. MCC enables multiple research applications including metaphor detection benchmarking, quality-aware generation systems, and patient-centered communication tools.

DBMay 18, 2023

The Water Health Open Knowledge Graph

Gianluca Carletti, Elio Giulianelli, Anna Sofia Lippolis et al.

Recently, an increasing interest in the management of water and health resources has been recorded. This interest is fed by the global sustainability challenges posed to the humanity that have water scarcity and quality at their core. Thus, the availability of effective, meaningful and open data is crucial to address those issues in the broader context of the Sustainable Development Goals of clean water and sanitation as targeted by the United Nations. In this paper, we present the Water Health Open Knowledge Graph (WHOW-KG) along with its design methodology and analysis on impact. WHOW-KG is a semantic knowledge graph that models data on water consumption, pollution, infectious disease rates and drug distribution. The WHOW-KG is developed in the context of the EU-funded WHOW (Water Health Open Knowledge) project and aims at supporting a wide range of applications: from knowledge discovery to decision-making, making it a valuable resource for researchers, policymakers, and practitioners in the water and health domains. The WHOW-KG consists of a network of five ontologies and related linked open data, modelled according to those ontologies.

AIJan 1, 2021

An Ontology Design Pattern for representing Recurrent Situations

Valentina Anita Carriero, Aldo Gangemi, Andrea Giovanni Nuzzolese et al.

In this paper, we present an Ontology Design Pattern for representing situations that recur at regular periods and share some invariant factors, which unify them conceptually: we refer to this set of recurring situations as recurrent situation series. The proposed pattern appears to be foundational, since it can be generalised for modelling the top-level domain-independent concept of recurrence, which is strictly associated with invariance. The pattern reuses other foundational patterns such as Collection, Description and Situation, Classification, Sequence. Indeed, a recurrent situation series is formalised as both a collection of situations occurring regularly over time and unified according to some properties that are common to all the members, and a situation itself, which provides a relational context to its members that satisfy a reference description. Besides including some exemplifying instances of this pattern, we show how it has been implemented and specialised to model recurrent cultural events and ceremonies in ArCo, the Knowledge Graph of Italian cultural heritage.

AINov 25, 2020

The Landscape of Ontology Reuse Approaches

Valentina Anita Carriero, Marilena Daquino, Aldo Gangemi et al.

Ontology reuse aims to foster interoperability and facilitate knowledge reuse. Several approaches are typically evaluated by ontology engineers when bootstrapping a new project. However, current practices are often motivated by subjective, case-by-case decisions, which hamper the definition of a recommended behaviour. In this chapter we argue that to date there are no effective solutions for supporting developers' decision-making process when deciding on an ontology reuse strategy. The objective is twofold: (i) to survey current approaches to ontology reuse, presenting motivations, strategies, benefits and limits, and (ii) to analyse two representative approaches and discuss their merits.

AIJul 9, 2020

A Reference Software Architecture for Social Robots

Luigi Asprino, Paolo Ciancarini, Andrea Giovanni Nuzzolese et al.

Social Robotics poses tough challenges to software designers who are required to take care of difficult architectural drivers like acceptability, trust of robots as well as to guarantee that robots establish a personalised interaction with their users. Moreover, in this context recurrent software design issues such as ensuring interoperability, improving reusability and customizability of software components also arise. Designing and implementing social robotic software architectures is a time-intensive activity requiring multi-disciplinary expertise: this makes difficult to rapidly develop, customise, and personalise robotic solutions. These challenges may be mitigated at design time by choosing certain architectural styles, implementing specific architectural patterns and using particular technologies. Leveraging on our experience in the MARIO project, in this paper we propose a series of principles that social robots may benefit from. These principles lay also the foundations for the design of a reference software architecture for Social Robots. The ultimate goal of this work is to establish a common ground based on a reference software architecture to allow to easily reuse robotic software components in order to rapidly develop, implement, and personalise Social Robots.

AINov 18, 2019

Pattern-based design applied to cultural heritage knowledge graphs

Valentina Anita Carriero, Aldo Gangemi, Maria Letizia Mancinelli et al.

Ontology Design Patterns (ODPs) have become an established and recognised practice for guaranteeing good quality ontology engineering. There are several ODP repositories where ODPs are shared as well as ontology design methodologies recommending their reuse. Performing rigorous testing is recommended as well for supporting ontology maintenance and validating the resulting resource against its motivating requirements. Nevertheless, it is less than straightforward to find guidelines on how to apply such methodologies for developing domain-specific knowledge graphs. ArCo is the knowledge graph of Italian Cultural Heritage and has been developed by using eXtreme Design (XD), an ODP- and test-driven methodology. During its development, XD has been adapted to the need of the CH domain e.g. gathering requirements from an open, diverse community of consumers, a new ODP has been defined and many have been specialised to address specific CH requirements. This paper presents ArCo and describes how to apply XD to the development and validation of a CH knowledge graph, also detailing the (intellectual) process implemented for matching the encountered modelling problems to ODPs. Relevant contributions also include a novel web tool for supporting unit-testing of knowledge graphs, a rigorous evaluation of ArCo, and a discussion of methodological lessons learned during ArCo development.

AISep 4, 2019

SQuAP-Ont: an Ontology of Software Quality Relational Factors from Financial Systems

Paolo Ciancarini, Andrea Giovanni Nuzzolese, Valentina Presutti et al.

Quality, architecture, and process are considered the keystones of software engineering. ISO defines them in three separate standards. However, their interaction has been scarcely studied, so far. The SQuAP model (Software Quality, Architecture, Process) describes twenty-eight main factors that impact on software quality in banking systems, and each factor is described as a relation among some characteristics from the three ISO standards. Hence, SQuAP makes such relations emerge rigorously, although informally. In this paper, we present SQuAP-Ont, an OWL ontology designed by following a well-established methodology based on the re-use of Ontology Design Patterns (i.e. ODPs). SQuAP-Ont formalises the relations emerging from SQuAP to represent and reason via Linked Data about software engineering in a three-dimensional model consisting of quality, architecture, and process ISO characteristics.

AIMay 7, 2019

ArCo: the Italian Cultural Heritage Knowledge Graph

Valentina Anita Carriero, Aldo Gangemi, Maria Letizia Mancinelli et al.

ArCo is the Italian Cultural Heritage knowledge graph, consisting of a network of seven vocabularies and 169 million triples about 820 thousand cultural entities. It is distributed jointly with a SPARQL endpoint, a software for converting catalogue records to RDF, and a rich suite of documentation material (testing, evaluation, how-to, examples, etc.). ArCo is based on the official General Catalogue of the Italian Ministry of Cultural Heritage and Activities (MiBAC) - and its associated encoding regulations - which collects and validates the catalogue records of (ideally) all Italian Cultural Heritage properties (excluding libraries and archives), contributed by CH administrators from all over Italy. We present its structure, design methods and tools, its growing community, and delineate its importance, quality, and impact.