CL · Oct 31, 2024

Responsible Retrieval Augmented Generation for Climate Decision Making from Documents

Matyas Juhasz, Kalyan Dutia, Henry Franks, Conor Delahunty, Patrick Fawbert Mills, Harrison Pim

arXiv:2410.23902v1 · 9 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reliable information retrieval for climate decision-makers, but it is incremental as it builds on existing RAG methods with domain-specific adaptations.

The authors aim to make information in complex climate documents more accessible to decision-makers. They address known limitations of generative AI, such as hallucinations and weak domain-specific performance, with a novel evaluation framework and a prototype Retrieval-Augmented Generation (RAG) tool for climate law and policy documents. They also publish a dataset and evaluation tools to facilitate adoption.

Climate decision making is constrained by the complexity and inaccessibility of key information within lengthy, technical, and multilingual documents. Generative AI technologies offer a promising route for improving the accessibility of information contained within these documents, but suffer from limitations. These include (1) a tendency to hallucinate or misrepresent information, (2) difficulty in steering or guaranteeing properties of generated output, and (3) reduced performance in specific technical domains. To address these challenges, we introduce a novel evaluation framework with domain-specific dimensions tailored for climate-related documents. We then apply this framework to evaluate Retrieval-Augmented Generation (RAG) approaches and assess retrieval and generation quality within a prototype tool that answers questions about individual climate law and policy documents. In addition, we publish a human-annotated dataset and scalable automated evaluation tools, with the aim of facilitating broader adoption and robust assessment of these systems in the climate domain. Our findings highlight the key components of responsible deployment of RAG to enhance decision-making, while also providing insights into user experience (UX) considerations for safely deploying such systems to build trust with users in high-risk domains.
