RoD-TAL: A Benchmark for Answering Questions in Romanian Driving License Exams
This work addresses the need for AI tools in legal education for under-resourced languages such as Romanian. Its contribution is incremental, since it applies existing methods to a new dataset.
The researchers evaluated how well AI models understand Romanian driving law by creating RoD-TAL, a multimodal dataset of driving test questions. They found that domain-specific fine-tuning improved retrieval and that reasoning methods raised question-answering accuracy above the passing grade for driving exams, though visual reasoning remained difficult.
The intersection of AI and legal systems presents a growing need for tools that support legal education, particularly in under-resourced languages such as Romanian. In this work, we evaluate the capabilities of Large Language Models (LLMs) and Vision-Language Models (VLMs) in understanding and reasoning about Romanian driving law through textual and visual question-answering tasks. To facilitate this, we introduce RoD-TAL, a novel multimodal dataset comprising text-based and image-based Romanian driving test questions, alongside annotated legal references and human explanations. We implement and assess retrieval-augmented generation (RAG) pipelines, dense retrievers, and reasoning-optimized models across four tasks: Information Retrieval (IR), Question Answering (QA), Visual IR, and Visual QA. Our experiments demonstrate that domain-specific fine-tuning significantly enhances retrieval performance, while chain-of-thought prompting and specialized reasoning models improve QA accuracy, surpassing the minimum grades required to pass driving exams. However, visual reasoning remains challenging, highlighting both the potential and the limitations of applying LLMs and VLMs to legal education.
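As an illustration of how retrieval against annotated legal references might be scored, the following is a minimal sketch of Recall@k. The abstract does not say which retrieval metrics are used, so the choice of Recall@k is an assumption. The function names, the question identifiers, and the article identifiers (such as "art_35") are hypothetical toy values, not entries from RoD-TAL.

```python
from typing import Dict, List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the annotated relevant articles found in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(
    runs: Dict[str, List[str]], gold: Dict[str, Set[str]], k: int
) -> float:
    """Average Recall@k over all questions that have gold annotations."""
    # A question with no retrieved list is scored as an empty ranking.
    scores = [recall_at_k(runs.get(q, []), gold[q], k) for q in gold]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: gold legal references per question and the
# ranked article lists returned by some retriever (assumed, not from the paper).
gold = {"q1": {"art_35", "art_41"}, "q2": {"art_12"}}
runs = {
    "q1": ["art_35", "art_7", "art_41"],
    "q2": ["art_3", "art_9", "art_12"],
}

print(mean_recall_at_k(runs, gold, k=2))
print(mean_recall_at_k(runs, gold, k=3))
```

Under this sketch, a fine-tuned retriever and a baseline could be compared by computing the same score on each system's ranked lists for the same set of questions.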