Bernardo Cuenca Grau

AI
h-index15
28papers
523citations
Novelty49%
AI Score49

28 Papers

LGMay 19, 2022
Cardinality-Minimal Explanations for Monotonic Neural Networks

Ouns El Harzli, Bernardo Cuenca Grau, Ian Horrocks · oxford

In recent years, there has been increasing interest in explanation methods for neural model predictions that offer precise formal guarantees. These include abductive (respectively, contrastive) methods, which aim to compute minimal subsets of input features that are sufficient for a given prediction to hold (respectively, to change a given prediction). The corresponding decision problems are, however, known to be intractable. In this paper, we investigate whether tractability can be regained by focusing on neural models implementing a monotonic function. Although the relevant decision problems remain intractable, we can show that they become solvable in polynomial time by means of greedy algorithms if we additionally assume that the activation functions are continuous everywhere and differentiable almost everywhere. Our experiments suggest favourable performance of our algorithms.

AIJun 7, 2023
Revisiting Inferential Benchmarks for Knowledge Graph Completion

Shuwen Liu, Bernardo Cuenca Grau, Ian Horrocks et al. · oxford

Knowledge Graph (KG) completion is the problem of extending an incomplete KG with missing facts. A key feature of Machine Learning approaches for KG completion is their ability to learn inference patterns, so that the predicted facts are the results of applying these patterns to the KG. Standard completion benchmarks, however, are not well-suited for evaluating models' abilities to learn patterns, because the training and test sets of these benchmarks are a random split of a given KG and hence do not capture the causality of inference patterns. We propose a novel approach for designing KG completion benchmarks based on the following principles: there is a set of logical rules so that the missing facts are the results of the rules' application; the training set includes both premises matching rule antecedents and the corresponding conclusions; the test set consists of the results of applying the rules to the training set; the negative examples are designed to discourage the models from learning rules not entailed by the rule set. We use our methodology to generate several benchmarks and evaluate a wide range of existing KG completion systems. Our results provide novel insights on the ability of existing models to induce inference patterns from incomplete KGs.

LGJun 1, 2022
An Empirical Study of Retrieval-enhanced Graph Neural Networks

Dingmin Wang, Shengchao Liu, Hanchen Wang et al. · stanford

Graph Neural Networks (GNNs) are effective tools for graph representation learning. Most GNNs rely on a recursive neighborhood aggregation scheme, named message passing, thereby their theoretical expressive power is limited to the first-order Weisfeiler-Lehman test (1-WL). An effective approach to this challenge is to explicitly retrieve some annotated examples used to enhance GNN models. While retrieval-enhanced models have been proved to be effective in many language and vision domains, it remains an open question how effective retrieval-enhanced GNNs are when applied to graph datasets. Motivated by this, we want to explore how the retrieval idea can help augment the useful information learned in the graph neural networks, and we design a retrieval-enhanced scheme called GRAPHRETRIEVAL, which is agnostic to the choice of graph neural network models. In GRAPHRETRIEVAL, for each input graph, similar graphs together with their ground-true labels are retrieved from an existing database. Thus they can act as a potential enhancement to complete various graph property predictive tasks. We conduct comprehensive experiments over 13 datasets, and we observe that GRAPHRETRIEVAL is able to reach substantial improvements over existing GNNs. Moreover, our empirical study also illustrates that retrieval enhancement is a promising remedy for alleviating the long-tailed label distribution problem.

DBAug 15, 2022
Seminaive Materialisation in DatalogMTL

Dingmin Wang, Przemysław Andrzej Wałęga, Bernardo Cuenca Grau

DatalogMTL is an extension of Datalog with metric temporal operators that has found applications in temporal ontology-based data access and query answering, as well as in stream reasoning. Practical algorithms for DatalogMTL are reliant on materialisation-based reasoning, where temporal facts are derived in a forward chaining manner in successive rounds of rule applications. Current materialisation-based procedures are, however, based on a naive evaluation strategy, where the main source of inefficiency stems from redundant computations. In this paper, we propose a materialisation-based procedure which, analogously to the classical seminaive algorithm in Datalog, aims at minimising redundant computation by ensuring that each temporal rule instance is considered at most once during the execution of the algorithm. Our experiments show that our optimised seminaive strategy for DatalogMTL is able to significantly reduce materialisation times.

LGMay 21
What are the Right Symmetries for Formal Theorem Proving?

Krzysztof Olejniczak, Radoslav Dimitrov, Xingyue Huang et al.

Formal theorem provers based on large language models (LLMs) are highly sensitive to superficial variations in problem representation: semantically equivalent statements can exhibit drastically different proof success rates, revealing a failure to respect structural symmetries inherent in formal mathematics. This raises a central question: what are the right symmetries for formal theorem proving? We introduce rewriting categories, a category-theoretic framework capturing the compositional, generally non-invertible transformations induced by proof tactics, and use it to formalize two symmetry notions: proof equivariance, governing how proof distributions transform under rewrites, and success invariance (i.e., invariance of success probability), requiring equivalent statements to be solved with the same probability. We observe that state-based next-tactic provers naturally satisfy proof equivariance by operating on proof states. In contrast, state-of-the-art LLM-based provers satisfy neither property, exhibiting large performance variation across equivalent formulations. To mitigate this, we propose test-time methods that aggregate over equivalent rewritings of the input, showing theoretically that they recover success invariance in the sampling limit, and empirically, that they improve robustness and performance under fixed inference budgets. Our results highlight symmetry as a key missing inductive bias in LLM-based theorem proving and suggest test-time computation as a practical route to approximate it.

LGAug 14, 2024
Relational Graph Convolutional Networks Do Not Learn Sound Rules

Matthew Morris, David J. Tena Cucala, Bernardo Cuenca Grau et al.

Graph neural networks (GNNs) are frequently used to predict missing facts in knowledge graphs (KGs). Motivated by the lack of explainability for the outputs of these models, recent work has aimed to explain their predictions using Datalog, a widely used logic-based formalism. However, such work has been restricted to certain subclasses of GNNs. In this paper, we consider one of the most popular GNN architectures for KGs, R-GCN, and we provide two methods to extract rules that explain its predictions and are sound, in the sense that each fact derived by the rules is also predicted by the GNN, for any input dataset. Furthermore, we provide a method that can verify that certain classes of Datalog rules are not sound for the R-GCN. In our experiments, we train R-GCNs on KG completion benchmarks, and we are able to verify that no Datalog rule is sound for these models, even though the models often obtain high to near-perfect accuracy. This raises some concerns about the ability of R-GCN models to generalise and about the explainability of their predictions. We further provide two variations to the training paradigm of R-GCN that encourage it to learn sound rules and find a trade-off between model accuracy and the number of learned sound rules.

AIMay 12, 2025
The Correspondence Between Bounded Graph Neural Networks and Fragments of First-Order Logic

Bernardo Cuenca Grau, Eva Feng, Przemysław A. Wałęga

Graph Neural Networks (GNNs) address two key challenges in applying deep learning to graph-structured data: they handle varying size input graphs and ensure invariance under graph isomorphism. While GNNs have demonstrated broad applicability, understanding their expressive power remains an important question. In this paper, we propose GNN architectures that correspond precisely to prominent fragments of first-order logic (FO), including various modal logics as well as more expressive two-variable fragments. To establish these results, we apply methods from finite model theory of first-order and modal logics to the domain of graph representation learning. Our results provide a unifying framework for understanding the logical expressiveness of GNNs within FO.

AISep 28, 2025
From Neural Networks to Logical Theories: The Correspondence between Fibring Modal Logics and Fibring Neural Networks

Ouns El Harzli, Bernardo Cuenca Grau, Artur d'Avila Garcez et al.

Fibring of modal logics is a well-established formalism for combining countable families of modal logics into a single fibred language with common semantics, characterized by fibred models. Inspired by this formalism, fibring of neural networks was introduced as a neurosymbolic framework for combining learning and reasoning in neural networks. Fibring of neural networks uses the (pre-)activations of a trained network to evaluate a fibring function computing the weights of another network whose outputs are injected back into the original network. However, the exact correspondence between fibring of neural networks and fibring of modal logics was never formally established. In this paper, we close this gap by formalizing the idea of fibred models \emph{compatible} with fibred neural networks. Using this correspondence, we then derive non-uniform logical expressiveness results for Graph Neural Networks (GNNs), Graph Attention Networks (GATs) and Transformer encoders. Longer-term, the goal of this paper is to open the way for the use of fibring as a formalism for interpreting the logical theories learnt by neural networks with the tools of computational logic.

LGAug 14, 2025
Logical Expressivity and Explanations for Monotonic GNNs with Scoring Functions

Matthew Morris, David J. Tena Cucala, Bernardo Cuenca Grau

Graph neural networks (GNNs) are often used for the task of link prediction: predicting missing binary facts in knowledge graphs (KGs). To address the lack of explainability of GNNs on KGs, recent works extract Datalog rules from GNNs with provable correspondence guarantees. The extracted rules can be used to explain the GNN's predictions; furthermore, they can help characterise the expressive power of various GNN models. However, these works address only a form of link prediction based on a restricted, low-expressivity graph encoding/decoding method. In this paper, we consider a more general and popular approach for link prediction where a scoring function is used to decode the GNN output into fact predictions. We show how GNNs and scoring functions can be adapted to be monotonic, use the monotonicity to extract sound rules for explaining predictions, and leverage existing results about the kind of rules that scoring functions can capture. We also define procedures for obtaining equivalent Datalog programs for certain classes of monotonic GNNs with scoring functions. Our experiments show that, on link prediction benchmarks, monotonic GNNs and scoring functions perform well in practice and yield many sound rules.

AIMay 29, 2023
On the Correspondence Between Monotonic Max-Sum GNNs and Datalog

David Tena Cucala, Bernardo Cuenca Grau, Boris Motik et al.

Although there has been significant interest in applying machine learning techniques to structured data, the expressivity (i.e., a description of what can be learned) of such techniques is still poorly understood. In this paper, we study data transformations based on graph neural networks (GNNs). First, we note that the choice of how a dataset is encoded into a numeric form processable by a GNN can obscure the characterisation of a model's expressivity, and we argue that a canonical encoding provides an appropriate basis. Second, we study the expressivity of monotonic max-sum GNNs, which cover a subclass of GNNs with max and sum aggregation functions. We show that, for each such GNN, one can compute a Datalog program such that applying the GNN to any dataset produces the same facts as a single round of application of the program's rules to the dataset. Monotonic max-sum GNNs can sum an unbounded number of feature vectors which can result in arbitrarily large feature values, whereas rule application requires only a bounded number of constants. Hence, our result shows that the unbounded summation of monotonic max-sum GNNs does not increase their expressive power. Third, we sharpen our result to the subclass of monotonic max GNNs, which use only the max aggregation function, and identify a corresponding class of Datalog programs.

AIJan 12, 2022
MeTeoR: Practical Reasoning in Datalog with Metric Temporal Operators

Dingmin Wang, Pan Hu, Przemysław Andrzej Wałęga et al.

DatalogMTL is an extension of Datalog with operators from metric temporal logic which has received significant attention in recent years. It is a highly expressive knowledge representation language that is well-suited for applications in temporal ontology-based query answering and stream processing. Reasoning in DatalogMTL is, however, of high computational complexity, making implementation challenging and hindering its adoption in applications. In this paper, we present a novel approach for practical reasoning in DatalogMTL which combines materialisation (a.k.a. forward chaining) with automata-based techniques. We have implemented this approach in a reasoner called MeTeoR and evaluated its performance using a temporal extension of the Lehigh University Benchmark and a benchmark based on real-world meteorological data. Our experiments show that MeTeoR is a scalable system which enables reasoning over complex temporal rules and datasets involving tens of millions of temporal facts.

MLFeb 14, 2021
Double-descent curves in neural networks: a new perspective using Gaussian processes

Ouns El Harzli, Bernardo Cuenca Grau, Guillermo Valle-Pérez et al.

Double-descent curves in neural networks describe the phenomenon that the generalisation error initially descends with increasing parameters, then grows after reaching an optimal number of parameters which is less than the number of data points, but then descends again in the overparameterized regime. In this paper, we use techniques from random matrix theory to characterize the spectral distribution of the empirical feature covariance matrix as a width-dependent perturbation of the spectrum of the neural network Gaussian process (NNGP) kernel, thus establishing a novel connection between the NNGP literature and the random matrix theory literature in the context of neural networks. Our analytical expression allows us to study the generalisation behavior of the corresponding kernel and GP regression, and provides a new interpretation of the double-descent phenomenon, namely as governed by the discrepancy between the width-dependent empirical kernel and the width-independent NNGP kernel.

AIAug 7, 2018
The Window Validity Problem in Rule-Based Stream Reasoning

Alessandro Ronca, Mark Kaminski, Bernardo Cuenca Grau et al.

Rule-based temporal query languages provide the expressive power and flexibility required to capture in a natural way complex analysis tasks over streaming data. Stream processing applications, however, typically require near real-time response using limited resources. In particular, it becomes essential that the underpinning query language has favourable computational properties and that stream processing algorithms are able to keep only a small number of previously received facts in memory at any point in time without sacrificing correctness. In this paper, we propose a recursive fragment of temporal Datalog with tractable data complexity and study the properties of a generic stream reasoning algorithm for this fragment. We focus on the window validity problem as a way to minimise the number of time points for which the stream reasoning algorithm needs to keep data in memory at any point in time.

AIMay 3, 2018
Consequence-based Reasoning for Description Logics with Disjunction, Inverse Roles, Number Restrictions, and Nominals

David Tena Cucala, Bernardo Cuenca Grau, Ian Horrocks

We present a consequence-based calculus for concept subsumption and classification in the description logic ALCHOIQ, which extends ALC with role hierarchies, inverse roles, number restrictions, and nominals. By using standard transformations, our calculus extends to SROIQ, which covers all of OWL 2 DL except for datatypes. A key feature of our calculus is its pay-as-you-go behaviour: unlike existing algorithms, our calculus is worst-case optimal for all the well-known proper fragments of ALCHOIQ, albeit not for the full logic.

AIApr 25, 2018
Stratified Negation in Limit Datalog Programs

Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev et al.

There has recently been an increasing interest in declarative data analysis, where analytic tasks are specified using a logical language, and their implementation and optimisation are delegated to a general-purpose query engine. Existing declarative languages for data analysis can be formalised as variants of logic programming equipped with arithmetic function symbols and/or aggregation, and are typically undecidable. In prior work, the language of $\mathit{limit\ programs}$ was proposed, which is sufficiently powerful to capture many analysis tasks and has decidable entailment problem. Rules in this language, however, do not allow for negation. In this paper, we study an extension of limit programs with stratified negation-as-failure. We show that the additional expressive power makes reasoning computationally more demanding, and provide tight data complexity bounds. We also identify a fragment with tractable data complexity and sufficient expressivity to capture many relevant tasks.

AINov 10, 2017
Stream Reasoning in Temporal Datalog

Alessandro Ronca, Mark Kaminski, Bernardo Cuenca Grau et al.

In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities. This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received, as well as on data that arrived far in the past. Stream reasoning algorithms, however, must be able to stream out query answers as soon as possible, and can only keep a limited number of previous input facts in memory. In this paper, we propose novel reasoning problems to deal with these challenges, and study their computational properties on Datalog extended with a temporal sort and the successor function (a core rule-based language for stream reasoning applications).

AIMay 19, 2017
The Bag Semantics of Ontology-Based Data Access

Charalampos Nikolaou, Egor V. Kostylev, George Konstantinidis et al.

Ontology-based data access (OBDA) is a popular approach for integrating and querying multiple data sources by means of a shared ontology. The ontology is linked to the sources using mappings, which assign views over the data to ontology predicates. Motivated by the need for OBDA systems supporting database-style aggregate queries, we propose a bag semantics for OBDA, where duplicate tuples in the views defined by the mappings are retained, as is the case in standard databases. We show that bag semantics makes conjunctive query answering in OBDA coNP-hard in data complexity. To regain tractability, we consider a rather general class of queries and show its rewritability to a generalisation of the relational calculus to bags.

AIMay 19, 2017
Foundations of Declarative Data Analysis Using Limit Datalog Programs

Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev et al.

Motivated by applications in declarative data analysis, we study $\mathit{Datalog}_{\mathbb{Z}}$---an extension of positive Datalog with arithmetic functions over integers. This language is known to be undecidable, so we propose two fragments. In $\mathit{limit}~\mathit{Datalog}_{\mathbb{Z}}$ predicates are axiomatised to keep minimal/maximal numeric values, allowing us to show that fact entailment is coNExpTime-complete in combined, and coNP-complete in data complexity. Moreover, an additional $\mathit{stability}$ requirement causes the complexity to drop to ExpTime and PTime, respectively. Finally, we show that stable $\mathit{Datalog}_{\mathbb{Z}}$ can express many useful data analysis tasks, and so our results provide a sound foundation for the development of advanced information systems.

AIFeb 14, 2016
Extending Consequence-Based Reasoning to SRIQ

Andrew Bate, Boris Motik, Bernardo Cuenca Grau et al.

Consequence-based calculi are a family of reasoning algorithms for description logics (DLs), and they combine hypertableau and resolution in a way that often achieves excellent performance in practice. Up to now, however, they were proposed for either Horn DLs (which do not support disjunction), or for DLs without counting quantifiers. In this paper we present a novel consequence-based calculus for SRIQ---a rich DL that supports both features. This extension is non-trivial since the intermediate consequences that need to be derived during reasoning cannot be captured using DLs themselves. The results of our preliminary performance evaluation suggest the feasibility of our approach in practice.

AIApr 24, 2015
Controlled Query Evaluation for Datalog and OWL 2 Profile Ontologies

Bernardo Cuenca Grau, Evgeny Kharlamov, Egor V. Kostylev et al.

We study confidentiality enforcement in ontologies under the Controlled Query Evaluation framework, where a policy specifies the sensitive information and a censor ensures that query answers that may compromise the policy are not returned. We focus on censors that ensure confidentiality while maximising information access, and consider both Datalog and the OWL 2 profiles as ontology languages.

AIApr 20, 2015
Computing Horn Rewritings of Description Logics Ontologies

Mark Kaminski, Bernardo Cuenca Grau

We study the problem of rewriting an ontology O1 expressed in a DL L1 into an ontology O2 in a Horn DL L2 such that O1 and O2 are equisatisfiable when extended with an arbitrary dataset. Ontologies that admit such rewritings are amenable to reasoning techniques ensuring tractability in data complexity. After showing undecidability whenever L1 extends ALCF, we focus on devising efficiently checkable conditions that ensure existence of a Horn rewriting. By lifting existing techniques for rewriting Disjunctive Datalog programs into plain Datalog to the case of arbitrary first-order programs with function symbols, we identify a class of ontologies that admit Horn rewritings of polynomial size. Our experiments indicate that many real-world ontologies satisfy our sufficient conditions and thus admit polynomial Horn rewritings.

AINov 19, 2014
Ontology Module Extraction via Datalog Reasoning

Ana Armas Romero, Mark Kaminski, Bernardo Cuenca Grau et al.

Module extraction - the task of computing a (preferably small) fragment M of an ontology T that preserves entailments over a signature S - has found many applications in recent years. Extracting modules of minimal size is, however, computationally hard, and often algorithmically infeasible. Thus, practical techniques are based on approximations, where M provably captures the relevant entailments, but is not guaranteed to be minimal. Existing approximations, however, ensure that M preserves all second-order entailments of T w.r.t. S, which is stronger than is required in many applications, and may lead to large modules in practice. In this paper we propose a novel approach in which module extraction is reduced to a reasoning problem in datalog. Our approach not only generalises existing approximations in an elegant way, but it can also be tailored to preserve only specific kinds of entailments, which allows us to extract significantly smaller modules. An evaluation on widely-used ontologies has shown very encouraging results.

AIApr 11, 2014
Datalog Rewritability of Disjunctive Datalog Programs and its Applications to Ontology Reasoning

Mark Kaminski, Yavor Nenov, Bernardo Cuenca Grau

We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program, thus providing a novel characterisation of datalog rewritability. Motivated by this result, we propose weakly linear disjunctive datalog---a novel rule-based KR language that extends both datalog and linear disjunctive datalog and for which reasoning is tractable in data complexity. We then explore applications of weakly linear programs to ontology reasoning and propose a tractable extension of OWL 2 RL with disjunctive axioms. Our empirical results suggest that many non-Horn ontologies can be reduced to weakly linear programs and that query answering over such ontologies using a datalog engine is feasible in practice.

DBFeb 4, 2014
Acyclicity Notions for Existential Rules and Their Application to Query Answering in Ontologies

Bernardo Cuenca Grau, Ian Horrocks, Markus Krötzsch et al.

Answering conjunctive queries (CQs) over a set of facts extended with existential rules is a prominent problem in knowledge representation and databases. This problem can be solved using the chase algorithm, which extends the given set of facts with fresh facts in order to satisfy the rules. If the chase terminates, then CQs can be evaluated directly in the resulting set of facts. The chase, however, does not terminate necessarily, and checking whether the chase terminates on a given set of rules and facts is undecidable. Numerous acyclicity notions were proposed as sufficient conditions for chase termination. In this paper, we present two new acyclicity notions called model-faithful acyclicity (MFA) and model-summarising acyclicity (MSA). Furthermore, we investigate the landscape of the known acyclicity notions and establish a complete taxonomy of all notions known to us. Finally, we show that MFA and MSA generalise most of these notions. Existential rules are closely related to the Horn fragments of the OWL 2 ontology language; furthermore, several prominent OWL 2 reasoners implement CQ answering by using the chase to materialise all relevant facts. In order to avoid termination problems, many of these systems handle only the OWL 2 RL profile of OWL 2; furthermore, some systems go beyond OWL 2 RL, but without any termination guarantees. In this paper we also investigate whether various acyclicity notions can provide a principled and practical solution to these problems. On the theoretical side, we show that query answering for acyclic ontologies is of lower complexity than for general ontologies. On the practical side, we show that many of the commonly used OWL 2 ontologies are MSA, and that the number of facts obtained by materialisation is not too large. Our results thus suggest that principled development of materialisation-based OWL 2 reasoners is practically feasible.

AIJan 23, 2014
Reasoning over Ontologies with Hidden Content: The Import-by-Query Approach

Bernardo Cuenca Grau, Boris Motik

There is currently a growing interest in techniques for hiding parts of the signature of an ontology Kh that is being reused by another ontology Kv. Towards this goal, in this paper we propose the import-by-query framework, which makes the content of Kh accessible through a limited query interface. If Kv reuses the symbols from Kh in a certain restricted way, one can reason over Kv U Kh by accessing only Kv and the query interface. We map out the landscape of the import-by-query problem. In particular, we outline the limitations of our framework and prove that certain restrictions on the expressivity of Kh and the way in which Kv reuses symbols from Kh are strictly necessary to enable reasoning in our setting. We also identify cases in which reasoning is possible and we present suitable import-by-query reasoning algorithms.

AIJan 18, 2014
Completeness Guarantees for Incomplete Ontology Reasoners: Theory and Practice

Bernardo Cuenca Grau, Boris Motik, Giorgos Stoilos et al.

To achieve scalability of query answering, the developers of Semantic Web applications are often forced to use incomplete OWL 2 reasoners, which fail to derive all answers for at least one query, ontology, and data set. The lack of completeness guarantees, however, may be unacceptable for applications in areas such as health care and defence, where missing answers can adversely affect the applications functionality. Furthermore, even if an application can tolerate some level of incompleteness, it is often advantageous to estimate how many and what kind of answers are being lost. In this paper, we present a novel logic-based framework that allows one to check whether a reasoner is complete for a given query Q and ontology T---that is, whether the reasoner is guaranteed to compute all answers to Q w.r.t. T and an arbitrary data set A. Since ontologies and typical queries are often fixed at application design time, our approach allows application developers to check whether a reasoner known to be incomplete in general is actually complete for the kinds of input relevant for the application. We also present a technique that, given a query Q, an ontology T, and reasoners R_1 and R_2 that satisfy certain assumptions, can be used to determine whether, for each data set A, reasoner R_1 computes more answers to Q w.r.t. T and A than reasoner R_2. This allows application developers to select the reasoner that provides the highest degree of completeness for Q and T that is compatible with the applications scalability requirements. Our results thus provide a theoretical and practical foundation for the design of future ontology-based information systems that maximise scalability while minimising or even eliminating incompleteness of query answers.

AIApr 4, 2013
Computing Datalog Rewritings beyond Horn Ontologies

Bernardo Cuenca Grau, Boris Motik, Giorgos Stoilos et al.

Rewriting-based approaches for answering queries over an OWL 2 DL ontology have so far been developed mainly for Horn fragments of OWL 2 DL. In this paper, we study the possibilities of answering queries over non-Horn ontologies using datalog rewritings. We prove that this is impossible in general even for very simple ontology languages, and even if PTIME = NP. Furthermore, we present a resolution-based procedure for $\SHI$ ontologies that, in case it terminates, produces a datalog rewriting of the ontology. Our procedure necessarily terminates on DL-Lite_{bool}^H ontologies---an extension of OWL 2 QL with transitive roles and Boolean connectives.

AIAug 15, 2012
Evaluating Ontology Matching Systems on Large, Multilingual and Real-world Test Cases

Christian Meilicke, Ondrej Sváb-Zamazal, Cássia Trojahn et al.

In the field of ontology matching, the most systematic evaluation of matching systems is established by the Ontology Alignment Evaluation Initiative (OAEI), which is an annual campaign for evaluating ontology matching systems organized by different groups of researchers. In this paper, we report on the results of an intermediary OAEI campaign called OAEI 2011.5. The evaluations of this campaign are divided in five tracks. Three of these tracks are new or have been improved compared to previous OAEI campaigns. Overall, we evaluated 18 matching systems. We discuss lessons learned, in terms of scalability, multilingual issues and the ability do deal with real world cases from different domains.