AI | CL | Nov 3, 2024

Ontology Population using LLMs

arXiv:2411.01612v1 | 10 citations | h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the costly and challenging task of knowledge graph population for data integration and representation. It is an incremental advance, since it builds on existing LLM capabilities through prompt engineering.

The study tackled the problem of populating knowledge graphs from unstructured text by using Large Language Models (LLMs), achieving approximately 90% triple extraction accuracy when guided by a modular ontology.

Knowledge graphs (KGs) are increasingly utilized for data integration, representation, and visualization. While KG population is critical, it is often costly, especially when data must be extracted from unstructured natural-language text, which presents challenges such as ambiguity and complex interpretations. Large Language Models (LLMs) offer promising capabilities for such tasks, excelling in natural language understanding and content generation. However, their tendency to "hallucinate" can produce inaccurate outputs. Despite these limitations, LLMs offer rapid and scalable processing of natural language data, and with prompt engineering and fine-tuning, they can approximate human-level performance in extracting and structuring data for KGs. This study investigates LLM effectiveness for KG population, focusing on the Enslaved.org Hub Ontology. In this paper, we report that, compared to the ground truth, LLMs can extract ~90% of triples when provided a modular ontology as guidance in the prompts.
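
To make the "~90% of triples compared to the ground truth" figure concrete, the following is a minimal sketch of one way such a score could be computed: the fraction of ground-truth triples that also appear in the LLM output. The abstract does not state how triples were matched, so exact matching after simple normalization is an assumption here, and the function names and example triples are hypothetical placeholders, not data from the paper.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def normalize(triple: Triple) -> Triple:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    s, p, o = triple
    return (
        " ".join(s.lower().split()),
        " ".join(p.lower().split()),
        " ".join(o.lower().split()),
    )


def triple_recall(extracted: Iterable[Triple], ground_truth: Iterable[Triple]) -> float:
    """Return the fraction of ground-truth triples found in the extracted set.

    Assumption: a triple counts as found only on an exact match after normalization.
    """
    truth: Set[Triple] = {normalize(t) for t in ground_truth}
    if not truth:
        raise ValueError("ground truth must contain at least one triple")
    found: Set[Triple] = {normalize(t) for t in extracted}
    return len(truth & found) / len(truth)


# Hypothetical placeholder triples, for illustration only.
ground_truth = [
    ("person_1", "hasName", "name_1"),
    ("person_1", "participatedIn", "event_1"),
    ("event_1", "hasPlace", "place_1"),
]
extracted = [
    ("Person_1", "hasName", "name_1"),         # matches after normalization
    ("person_1", "participatedIn", "event_1"),  # exact match
    ("event_1", "hasPlace", "place_2"),         # wrong object, counts as a miss
]

score = triple_recall(extracted, ground_truth)
print(f"{score:.3f}")
```

Exact matching is the strictest option; a real evaluation might also accept near-matches (for example, alternative spellings of a name), which would raise the score.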

