AI-assisted cultural heritage dissemination: Comparing NMT and glossary-augmented LLM translation in rock art documents
For cultural heritage institutions with limited budgets, the paper demonstrates a low-overhead method to improve terminology control in specialized translation, though the gains are incremental over existing LLM prompting.
The study compares three machine translation setups for Spanish-to-English translation of rock art documents. Glossary-augmented LLM prompting (Gemini-RAG) achieves the highest terminology accuracy (81.4%) while maintaining overall quality (mean Direct Assessment, DA, of 85.3). It outperforms a strong NMT baseline (DeepL, 64.4% accuracy, 80.3 DA) on both measures, and a basic LLM prompt (Gemini-Simple, 69.1% accuracy, 85.2 DA) on terminology, with near-identical overall quality.
Cultural heritage institutions increasingly disseminate research and interpretive materials globally, but multilingual dissemination is constrained by limited budgets and staffing. In terminology-dense domains such as rock art, translation quality depends on accurate, consistent specialised terms, and small lexical errors can mislead non-specialists and reduce reuse. We compare three Spanish-to-English machine translation (MT) setups on a Spanish academic rock art text, focusing on simple, operationally feasible interventions rather than complex model-side modifications: (1) DeepL as a strong NMT baseline, (2) Gemini-Simple (an LLM with a basic prompt), and (3) Gemini-RAG (the same LLM with glossary-augmented prompting via term-pair retrieval). Using PEARMUT, we conduct a human evaluation via (i) multi-way Direct Assessment (DA, 0-100) and (ii) targeted terminology auditing with a restricted MQM taxonomy. Gemini-RAG yields the highest exact-match terminology accuracy (81.4%), versus Gemini-Simple (69.1%) and DeepL (64.4%). It does so while preserving overall quality: mean DA is 85.3 for Gemini-RAG and 85.2 for Gemini-Simple, and both outperform DeepL (80.3). These results show that glossary-augmented prompting is a low-overhead way to improve terminology control in cultural-heritage translation, provided that institutions maintain minimal terminology resources and lightweight evaluation procedures.
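To make the idea of glossary-augmented prompting via term-pair retrieval more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small in-memory Spanish-English glossary (the entries shown are hypothetical), retrieves the term pairs whose source term occurs in the input, and prepends them to an illustrative prompt. The abstract does not specify the prompt wording or how Gemini is called, so no model call is made. The `term_accuracy` helper is an automatic stand-in for an exact-match terminology check; the study itself reports terminology results from a human audit.

```python
import re

# Hypothetical Spanish-English glossary entries, for illustration only.
GLOSSARY = {
    "arte rupestre": "rock art",
    "abrigo": "rock shelter",
    "pintura esquemática": "schematic painting",
}


def retrieve_term_pairs(source, glossary):
    """Return (source_term, target_term) pairs whose source term occurs
    in the text as a whole word or phrase, ignoring case."""
    hits = []
    for src_term, tgt_term in glossary.items():
        pattern = r"\b" + re.escape(src_term) + r"\b"
        if re.search(pattern, source, flags=re.IGNORECASE):
            hits.append((src_term, tgt_term))
    return hits


def build_prompt(source, pairs):
    """Assemble a translation prompt that lists the retrieved term pairs.
    The wording is an assumption, not the prompt used in the study."""
    lines = ["Translate the following Spanish text into English."]
    if pairs:
        lines.append("Use these term translations exactly:")
        lines.extend(f"- {src} -> {tgt}" for src, tgt in pairs)
    lines.append("")
    lines.append("Text: " + source)
    return "\n".join(lines)


def term_accuracy(translation, pairs):
    """Fraction of expected target terms that appear verbatim
    (case-insensitive) in the translation; None if no terms apply."""
    if not pairs:
        return None
    found = sum(
        1
        for _, tgt in pairs
        if re.search(r"\b" + re.escape(tgt) + r"\b", translation, flags=re.IGNORECASE)
    )
    return found / len(pairs)


source_text = "El abrigo contiene arte rupestre esquemático."
pairs = retrieve_term_pairs(source_text, GLOSSARY)
prompt = build_prompt(source_text, pairs)
```

The string in `prompt` would then be sent to the LLM of choice. Keeping retrieval as a plain substring match over a maintained glossary reflects the low-overhead setting the abstract describes; inflected forms (for example plurals) would need lemmatisation or fuzzy matching, which this sketch omits.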