AI · Jul 18, 2025

Language Models as Ontology Encoders

Oxford
arXiv:2507.14334v1 · 2 citations · h-index: 12 · Has Code · SemWeb
Originality: Incremental advance
AI Analysis

This work addresses limitations in ontology embeddings for domains like healthcare and bioinformatics, offering an incremental improvement over existing methods by better integrating text and structure.

The paper tackles the problem of embedding OWL ontologies by proposing OnT, a method that combines pretrained language models with hyperbolic geometric modeling to incorporate textual labels while preserving logical structure, achieving state-of-the-art performance on prediction and inference tasks across four real-world ontologies.

OWL (Web Ontology Language) ontologies, which can formally represent complex knowledge and support semantic reasoning, have been widely adopted across domains such as healthcare and bioinformatics. Recently, ontology embeddings have gained wide attention due to their potential to infer plausible new knowledge and approximate complex reasoning. However, existing methods face notable limitations: embeddings based on geometric models typically overlook valuable textual information, resulting in suboptimal performance, while approaches that incorporate text, often based on language models, fail to preserve the logical structure. In this work, we propose a new ontology embedding method, OnT, which tunes a Pretrained Language Model (PLM) via geometric modeling in a hyperbolic space to incorporate textual labels effectively while preserving class hierarchies and other logical relationships of the Description Logic EL. Extensive experiments on four real-world ontologies show that OnT consistently outperforms the baselines, including the state of the art, on both the prediction and the inference of axioms. OnT also demonstrates strong potential in real-world applications, as indicated by its robust transfer learning abilities and its effectiveness in real cases of constructing a new ontology from SNOMED CT. Data and code are available at https://github.com/HuiYang1997/OnT.
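
As a rough illustration of the hyperbolic geometry the abstract refers to, the sketch below computes the standard geodesic distance in the unit Poincaré ball (curvature -1). It is not OnT's training objective, which the abstract does not specify. The function names, the toy coordinates, and the convention that more general classes sit closer to the origin are assumptions made for illustration only.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))

def hyperbolic_norm(u):
    """Distance from the origin; grows without bound as u approaches the boundary."""
    return poincare_distance([0.0] * len(u), u)

# Hypothetical 2-D embeddings (not taken from the paper): under the assumed
# convention, the general class lies nearer the origin than its subclass.
disease = [0.1, 0.0]
heart_disease = [0.5, 0.1]

print(hyperbolic_norm(disease) < hyperbolic_norm(heart_disease))
print(poincare_distance(disease, heart_disease))
```

Because distances grow rapidly near the boundary of the ball, hyperbolic space leaves room for many fine-grained classes around each general one, which is the usual motivation for embedding hierarchies this way.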

