CL, Nov 5, 2025

Kastor: Fine-tuned Small Language Models for Shape-based Active Relation Extraction

Ringwald Celian, Gandon Fabien, Faron Catherine, Michel Franck, Abi Akl Hanna

arXiv:2511.03466v1, 1 citation, h-index: 5, ESWC

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient knowledge base completion in specialized domains, representing an incremental improvement over existing RDF pattern-based extraction methods.

The authors introduce Kastor, a framework for fine-tuning small language models on relation extraction. Kastor reformulates the validation task: instead of validating against a single SHACL shape, it evaluates all combinations of properties derived from the shape, which the authors report improves model generalization and performance. It also uses an iterative learning process to refine noisy knowledge bases and uncover new facts.

RDF pattern-based extraction is a compelling approach for fine-tuning small language models (SLMs) by focusing a relation extraction task on a specified SHACL shape. This technique enables the development of efficient models trained on limited text and RDF data. In this article, we introduce Kastor, a framework that advances this approach to meet the demands for completing and refining knowledge bases in specialized domains. Kastor reformulates the traditional validation task, shifting from single SHACL shape validation to evaluating all possible combinations of properties derived from the shape. By selecting the optimal combination for each training example, the framework significantly enhances model generalization and performance. Additionally, Kastor employs an iterative learning process to refine noisy knowledge bases, enabling the creation of robust models capable of uncovering new, relevant facts.
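The idea of evaluating every property combination from a shape and keeping the best one per training example can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the "optimal" combination is chosen, so the selection rule here (the largest subset whose values all literally appear in the text) is an assumption, and the function names `property_combinations`, `supported`, and `best_combination`, as well as the example properties, are hypothetical.

```python
from itertools import combinations


def property_combinations(properties):
    """Yield every non-empty subset of the shape's properties."""
    for size in range(1, len(properties) + 1):
        yield from combinations(properties, size)


def supported(value, text):
    """Assumed heuristic: a value is supported if it literally occurs in the text."""
    return value.lower() in text.lower()


def best_combination(properties, triples, text):
    """Return the largest property subset whose values are all supported by the text.

    `triples` maps a property to its value for one entity. The selection rule
    is a stand-in for whatever criterion the paper actually uses.
    """
    best = ()
    for combo in property_combinations(properties):
        if all(p in triples and supported(triples[p], text) for p in combo):
            if len(combo) > len(best):
                best = combo
    return best


# Hypothetical shape properties and one noisy training example.
shape_properties = ["rdfs:label", "dbo:birthDate", "dbo:birthPlace"]
example_triples = {
    "rdfs:label": "Ada Lovelace",
    "dbo:birthDate": "1815",
    "dbo:birthPlace": "Paris",  # not stated in the text below
}
example_text = "Ada Lovelace was born in 1815 in London."

selected = best_combination(shape_properties, example_triples, example_text)
print(selected)
```

In this sketch the unsupported `dbo:birthPlace` value is dropped from the target for this example rather than discarding the whole example, which is one plausible reading of why per-example combination selection could help with noisy knowledge bases. Note that the number of subsets grows as 2^n - 1 in the number of shape properties, so exhaustive enumeration is only practical for small shapes.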
