Lattice-preserving $\mathcal{ALC}$ ontology embeddings with saturation
This work addresses a gap in ontology embedding for bioinformatics and related fields by handling more expressive logics without requiring individuals, though it is incremental as it builds on existing semantic-preserving methods.
The paper tackles the problem of generating vector embeddings for OWL ontologies expressed in the Description Logic $\mathcal{ALC}$. Because many such ontologies contain no individuals, it proposes a method that preserves the lattice structure of concept descriptions, materialized through connections between DL and Category Theory. The method is reported to outperform state-of-the-art approaches in several knowledge base completion tasks.
Generating vector representations (embeddings) of OWL ontologies is a task of growing interest due to its applications in predicting missing facts and knowledge-enhanced learning in fields such as bioinformatics. The underlying semantics of OWL ontologies are expressed using Description Logics (DLs). Initial approaches to generating embeddings relied on constructing a graph out of ontologies, neglecting the semantics of the logic therein. Recent semantic-preserving embedding methods often target lightweight DL languages like $\mathcal{EL}^{++}$, ignoring more expressive information in ontologies. Although some approaches aim to embed more expressive DLs like $\mathcal{ALC}$, those methods require the existence of individuals, while many real-world ontologies are devoid of them. We propose an ontology embedding method for the $\mathcal{ALC}$ DL language that considers the lattice structure of concept descriptions. We use connections between DL and Category Theory to materialize the lattice structure and embed it using an order-preserving embedding method. We show that our method outperforms state-of-the-art methods in several knowledge base completion tasks. Furthermore, we incorporate saturation procedures that increase the information within the constructed lattices. We make our code and data available at \url{https://github.com/bio-ontology-research-group/catE}.
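To make the idea of a lattice over concept descriptions concrete, the following is a minimal Python sketch. It is not the catE implementation. The function names `lattice_edges` and `transitive_closure`, the string encoding of concepts, and the example concept names are all assumptions made for illustration. The sketch records only the standard lattice facts $C \sqcap D \sqsubseteq C$, $C \sqsubseteq C \sqcup D$ and $\bot \sqsubseteq C \sqsubseteq \top$ as directed edges, and uses a naive transitive closure as a stand-in for adding implied edges; the paper's actual saturation procedures are not described in this abstract.

```python
from itertools import combinations

# Hypothetical string names for the top and bottom concepts.
TOP, BOTTOM = "Top", "Bottom"


def lattice_edges(concepts):
    """Return directed edges (sub, sup), each meaning sub is subsumed by sup.

    Only pairwise conjunctions and disjunctions of the given concept
    names are generated; this is an illustrative fragment, not a full
    ALC concept lattice.
    """
    edges = set()
    for c in concepts:
        edges.add((BOTTOM, c))  # Bottom is below every concept
        edges.add((c, TOP))     # every concept is below Top
    for c, d in combinations(sorted(concepts), 2):
        meet = f"({c} and {d})"  # conjunction, the greatest lower bound
        join = f"({c} or {d})"   # disjunction, the least upper bound
        edges.update({(meet, c), (meet, d), (c, join), (d, join)})
        edges.update({(BOTTOM, meet), (join, TOP)})
    return edges


def transitive_closure(edges):
    """Naive closure: add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


edges = lattice_edges({"Parent", "Female"})
closure = transitive_closure(edges)
```

An order-preserving embedding would then be trained so that each edge `(sub, sup)` in `closure` is reflected by the chosen order relation in the vector space; that training step is outside the scope of this sketch.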