CL, DL | Nov 3, 2025

"Don't Teach Minerva": Guiding LLMs Through Complex Syntax for Faithful Latin Translation with RAG

arXiv:2511.01454v1 | h-index: 1 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses accurate Latin translation for researchers and linguists and offers an open-source alternative to proprietary systems. It is an incremental advance, since it builds on existing methods such as RAG and fine-tuning.

The paper tackles the challenge of translating Latin, a morphology-rich, low-resource language, by introducing a draft-based refinement pipeline: a fine-tuned NLLB-1.3B model produces a draft, and a zero-shot LLM refines it, optionally with RAG. The authors report performance statistically comparable to GPT-5 on both in-domain and out-of-domain benchmarks.

Translating a morphology-rich, low-resource language like Latin poses significant challenges. This paper introduces a reproducible draft-based refinement pipeline that elevates open-source Large Language Models (LLMs) to a performance level statistically comparable to top-tier proprietary systems. Our method first uses a fine-tuned NLLB-1.3B model to generate a high-quality, structurally faithful draft. A zero-shot LLM (Llama-3.3 or Qwen3) then polishes this draft, a process that can be further enhanced by augmenting the context with retrieved in-context examples (RAG). We demonstrate the robustness of this approach on two distinct benchmarks: a standard in-domain test set (Rosenthal, 2023) and a new, challenging out-of-domain (OOD) set of 12th-century Latin letters (2025). Our central finding is that this open-source RAG system achieves performance statistically comparable to the GPT-5 baseline, without any task-specific LLM fine-tuning. We release the pipeline, the Chartres OOD set, and evaluation scripts and models to facilitate replicability and further research.
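
The retrieval-and-refinement step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' released pipeline. The bag-of-words cosine retriever, the prompt wording, and the names `retrieve_examples` and `build_refinement_prompt` are all assumptions made for the example. The draft is assumed to come from the fine-tuned NLLB model, and the resulting prompt would be sent to the zero-shot LLM; neither model call is shown.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase and keep alphabetic runs; a crude stand-in for real Latin tokenization."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_examples(source, memory, k=2):
    """Return the k (latin, english) pairs whose Latin side is most similar to `source`.

    Assumption: lexical overlap is used here only for illustration;
    the paper's actual retriever may differ.
    """
    query = Counter(tokenize(source))
    ranked = sorted(
        memory,
        key=lambda pair: cosine(query, Counter(tokenize(pair[0]))),
        reverse=True,
    )
    return ranked[:k]


def build_refinement_prompt(source, draft, examples):
    """Assemble a zero-shot refinement prompt (hypothetical wording)."""
    lines = ["Refine the draft English translation of the Latin source.", ""]
    for latin, english in examples:
        lines += [f"Example Latin: {latin}", f"Example English: {english}", ""]
    lines += [
        f"Latin source: {source}",
        f"Draft translation: {draft}",
        "Refined translation:",
    ]
    return "\n".join(lines)


# Toy translation memory; these pairs are illustrative, not from the paper's data.
memory = [
    ("Arma virumque cano", "I sing of arms and the man"),
    ("Gallia est divisa in partes", "Gaul is divided into parts"),
    ("Omnia vincit amor", "Love conquers all"),
]
source = "Gallia est omnis divisa in partes tres"
draft = "Gaul is all divided in three parts"  # would come from the draft model
examples = retrieve_examples(source, memory, k=1)
prompt = build_refinement_prompt(source, draft, examples)
```

Keeping retrieval and prompt construction as separate functions makes it straightforward to swap in a different retriever or to run the refinement step without retrieved examples by passing an empty list.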

