46.4AIApr 1Code
IDEA2: Expert-in-the-loop competency question elicitation for collaborative ontology engineeringElliott Watkiss-Leek, Reham Alharbi, Harry Rostron et al.
Competency question (CQ) elicitation represents a critical but resource-intensive bottleneck in ontology engineering. This foundational phase is often hampered by the communication gap between domain experts, who possess the necessary knowledge, and ontology engineers, who formalise it. This paper introduces IDEA2, a novel, semi-automated workflow that integrates Large Language Models (LLMs) within a collaborative, expert-in-the-loop process to address this challenge. The methodology is characterised by a core iterative loop: an initial LLM-based extraction of CQs from requirement documents, a co-creational review and feedback phase by domain experts on an accessible collaborative platform, and an iterative, feedback-driven reformulation of rejected CQs by an LLM until consensus is achieved. To ensure transparency and reproducibility, the entire lifecycle of each CQ is tracked using a provenance model that captures the full lineage of edits, anonymised feedback, and generation parameters. The workflow was validated in 2 real-world scenarios (scientific data, cultural heritage), demonstrating that IDEA2 can accelerate the requirements engineering process, improve the acceptance and relevance of the resulting CQs, and exhibit high usability and effectiveness among domain experts. We release all code and experiments at https://github.com/KE-UniLiv/IDEA2
18.2AIApr 2
The AnIML Ontology: Enabling Semantic Interoperability for Large-Scale Experimental Data in Interconnected Scientific LabsWilf Morlidge, Elliott Watkiss-Leek, George Hannah et al.
Achieving semantic interoperability across heterogeneous experimental data systems remains a major barrier to data-driven scientific discovery. The Analytical Information Markup Language (AnIML), a flexible XML-based standard for analytical chemistry and biology, is increasingly used in industrial R&D labs for managing and exchanging experimental data. However, the expressivity of the XML schema permits divergent interpretations across stakeholders, introducing inconsistencies that undermine the interoperability the AnIML schema was designed to support. In this paper, we present the AnIML Ontology, an OWL 2 ontology that formalises the semantics of AnIML and aligns it with the Allotrope Data Format to support future cross-system and cross-lab interoperability. The ontology was developed using an expert-in-the-loop approach combining LLM-assisted requirement elicitation with collaborative ontology engineering. We validate the ontology through a multi-layered approach: data-driven transformation of real-world AnIML files into knowledge graphs, competency question verification via SPARQL, and a novel validation protocol based on adversarial negative competency questions mapped to established ontological anti-patterns and enforced via SHACL constraints.
CVNov 5, 2023
MirrorCalib: Utilizing Human Pose Information for Mirror-based Virtual Camera CalibrationLongyun Liao, Rong Zheng, Andrew Mitchell
In this paper, we present the novel task of estimating the extrinsic parameters of a virtual camera relative to a real camera in exercise videos with a mirror. This task poses a significant challenge in scenarios where the views from the real and mirrored cameras have no overlap or share salient features. To address this issue, prior knowledge of a human body and 2D joint locations are utilized to estimate the camera extrinsic parameters when a person is in front of a mirror. We devise a modified eight-point algorithm to obtain an initial estimation from 2D joint locations. The 2D joint locations are then refined subject to human body constraints. Finally, a RANSAC algorithm is employed to remove outliers by comparing their epipolar distances to a predetermined threshold. MirrorCalib achieves a rotation error of 1.82° and a translation error of 69.51 mm on a collected real-world dataset, which outperforms the state-of-art method.
AIJan 30, 2025
Broadening Ontologization Design: Embracing Data Pipeline StrategiesChris Partridge, Andrew Mitchell, Sergio de Cesare et al.
Our aim in this paper is to outline how the design space for the ontologization process is broader than current practice would suggest. We point out that engineering processes as well as products need to be designed and identify some components of the design. We investigate the possibility of designing a range of radically new practices implemented as data pipelines, providing examples of the new practices from our work over the last three decades with an outlier methodology, bCLEARer. We also suggest that setting an evolutionary context for ontologization helps one to better understand the nature of these new practices and provides the conceptual scaffolding that shapes fertile processes. Where this evolutionary perspective positions digitalization (the evolutionary emergence of computing technologies) as the latest step in a long evolutionary trail of information transitions. This reframes ontologization as a strategic tool for leveraging the emerging opportunities offered by digitalization.
DBSep 1, 2025
Disentangling the schema turn: Restoring the information base to conceptual modellingChris Partridge, Andrew Mitchell, Sergio de Cesare et al.
If one looks at contemporary mainstream development practices for conceptual modelling in computer science, these so clearly focus on a conceptual schema completely separated from its information base that the conceptual schema is often just called the conceptual model. These schema-centric practices are crystallized in almost every database textbook. We call this strong, almost universal, bias towards conceptual schemas the schema turn. The focus of this paper is on disentangling this turn within (computer science) conceptual modeling. It aims to shed some light on how it emerged and so show that it is not fundamental. To show that modern technology enables the adoption of an inclusive schema-and-base conceptual modelling approach, which in turn enables more automated, and empirically motivated practices. And to show, more generally, the space of possible conceptual modelling practices is wider than currently assumed. It also uses the example of bCLEARer to show that the implementations in this wider space will probably need to rely on new pipeline-based conceptual modelling techniques. So, it is possible that the schema turn's complete exclusion of the information base could be merely a temporary evolutionary detour.
AIJul 4, 2025
RELRaE: LLM-Based Relationship Extraction, Labelling, Refinement, and EvaluationGeorge Hannah, Jacopo de Berardinis, Terry R. Payne et al.
A large volume of XML data is produced in experiments carried out by robots in laboratories. In order to support the interoperability of data between labs, there is a motivation to translate the XML data into a knowledge graph. A key stage of this process is the enrichment of the XML schema to lay the foundation of an ontology schema. To achieve this, we present the RELRaE framework, a framework that employs large language models in different stages to extract and accurately label the relationships implicitly present in the XML schema. We investigate the capability of LLMs to accurately generate these labels and then evaluate them. Our work demonstrates that LLMs can be effectively used to support the generation of relationship labels in the context of lab automation, and that they can play a valuable role within semi-automatic ontology generation frameworks more generally.