AI, Jul 6, 2020
Separating Positive and Negative Data Examples by Concepts and Formulas: The Case of Restricted Signatures
Jean Christoph Jung, Carsten Lutz, Hadrien Pulcini et al.
We study the separation of positive and negative data examples in terms of description logic (DL) concepts and formulas of decidable FO fragments, in the presence of an ontology. In contrast to previous work, we add a signature that specifies a subset of the symbols from the data and ontology that can be used for separation. We consider weak and strong versions of the resulting problem that differ in how the negative examples are treated. Our main results are that (a projective form of) the weak version is decidable in $\mathcal{ALCI}$ while it is undecidable in the guarded fragment GF, the guarded negation fragment GNF, and the DL $\mathcal{ALCFIO}$, and that strong separability is decidable in $\mathcal{ALCI}$, GF, and GNF. We also provide (mostly tight) complexity bounds.
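For orientation, the weak/strong distinction can be sketched as follows. The notation is an assumption, since the abstract does not fix any symbols: $\mathcal{K}$ stands for the ontology together with the data, $P$ and $N$ for the sets of positive and negative examples, $\Sigma$ for the permitted signature, and $\varphi$ for a candidate separating concept or formula with $\mathrm{sig}(\varphi) \subseteq \Sigma$.

$$
\begin{aligned}
\text{weak separation:}\quad & \mathcal{K} \models \varphi(a) \ \text{for all } a \in P
  \quad\text{and}\quad \mathcal{K} \not\models \varphi(b) \ \text{for all } b \in N,\\
\text{strong separation:}\quad & \mathcal{K} \models \varphi(a) \ \text{for all } a \in P
  \quad\text{and}\quad \mathcal{K} \models \neg\varphi(b) \ \text{for all } b \in N.
\end{aligned}
$$

Under this reading, weak separation only requires that the negative examples are not entailed to satisfy $\varphi$, whereas strong separation requires that they are entailed to violate it.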
LO, Jul 3, 2020
Logical Separability of Labeled Data Examples under Ontologies
Jean Christoph Jung, Carsten Lutz, Hadrien Pulcini et al.
Finding a logical formula that separates positive and negative examples given in the form of labeled data items is fundamental in applications such as concept learning, reverse engineering of database queries, generating referring expressions, and entity comparison in knowledge graphs. In this paper, we investigate the existence of a separating formula for data in the presence of an ontology. Both for the ontology language and the separation language, we concentrate on first-order logic and the following important fragments thereof: the description logic $\mathcal{ALCI}$, the guarded fragment, the two-variable fragment, and the guarded negation fragment. For separation, we also consider (unions of) conjunctive queries. We consider several forms of separability that differ in the treatment of negative examples and in whether or not they admit the use of additional helper symbols to achieve separation. Our main results are model-theoretic characterizations of (all variants of) separability, the comparison of the separating power of different languages, and the investigation of the computational complexity of deciding separability.