CL AI · Sep 18, 2025

TextMine: Data, Evaluation Framework and Ontology-guided LLM Pipeline for Humanitarian Mine Action

Chenyue Zhou, Gürkan Solmaz, Flavio Cirillo, Kiril Gashteovski, Jonathan Fürst

arXiv:2509.15098v22.71 citations · h-index: 18

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of information transferability for humanitarian mine action agencies, though it is incremental as it applies existing LLM methods to a new domain-specific dataset.

The authors tackled the problem of extracting structured knowledge from unstructured humanitarian mine action reports by introducing TextMine, a dataset, evaluation framework, and ontology-guided LLM pipeline, which improved extraction accuracy by up to 44.2%, reduced hallucinations by 22.5%, and enhanced format adherence by 20.9% compared to baselines.

Humanitarian Mine Action (HMA) addresses the challenge of detecting and removing landmines from conflict regions. Much of the life-saving operational knowledge produced by HMA agencies is buried in unstructured reports, limiting the transferability of information between agencies. To address this issue, we propose TextMine: the first dataset, evaluation framework and ontology-guided large language model (LLM) pipeline for knowledge extraction in the HMA domain. TextMine structures HMA reports into (subject, relation, object) triples, thus creating domain-specific knowledge. To ensure real-world relevance, we created the dataset in collaboration with the Cambodian Mine Action Center (CMAC). We further introduce a bias-aware evaluation framework that combines human-annotated triples with an LLM-as-Judge protocol to mitigate position bias in reference-free scoring. Our experiments show that ontology-aligned prompts improve extraction accuracy by up to 44.2%, reduce hallucinations by 22.5%, and enhance format adherence by 20.9% compared to baseline models. We publicly release the dataset and code.
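To make the idea of ontology-checked (subject, relation, object) triples concrete, the following is a minimal sketch, not the authors' released code. The ontology, the entity types, the relation names, and both scores are hypothetical illustrations; the abstract does not specify how TextMine defines its ontology or computes its metrics. The sketch assumes that each extracted triple arrives as one line of text and that entity names contain no commas.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: relation -> (required subject type, required object type).
ONTOLOGY = {
    "located_in": ("Minefield", "Location"),
    "cleared_by": ("Minefield", "Organization"),
    "contains": ("Minefield", "Ordnance"),
}

# Hypothetical entity typing; a real pipeline would derive this from the ontology.
ENTITY_TYPES = {
    "Minefield A": "Minefield",
    "Battambang": "Location",
    "CMAC": "Organization",
    "anti-personnel mine": "Ordnance",
}


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str


def parse_triple(line):
    """Parse '(subject, relation, object)'; return None if the line is malformed."""
    text = line.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return None
    parts = [part.strip() for part in text[1:-1].split(",")]
    if len(parts) != 3 or not all(parts):
        return None
    return Triple(*parts)


def conforms(triple):
    """Check that the relation exists and both entity types match its signature."""
    signature = ONTOLOGY.get(triple.relation)
    if signature is None:
        return False
    subject_type, object_type = signature
    return (
        ENTITY_TYPES.get(triple.subject) == subject_type
        and ENTITY_TYPES.get(triple.obj) == object_type
    )


def score(lines):
    """Return the share of well-formed lines and the share of ontology-valid triples."""
    parsed = [parse_triple(line) for line in lines]
    well_formed = [t for t in parsed if t is not None]
    valid = [t for t in well_formed if conforms(t)]
    total = len(lines)
    return {
        "format_adherence": len(well_formed) / total if total else 0.0,
        "ontology_conformance": len(valid) / total if total else 0.0,
    }


model_output = [
    "(Minefield A, located_in, Battambang)",   # well-formed and valid
    "(Minefield A, cleared_by, Battambang)",   # well-formed, wrong object type
    "Minefield A contains mines",              # not in triple format
]
result = score(model_output)
```

In this sketch a triple that parses but violates the ontology counts toward format adherence only, which keeps formatting errors separate from type or relation errors.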
