IRApr 6

DisastRAG: A Multi-Source Disaster Information Integration and Access System Based on Retrieval-Augmented Large Language Models

arXiv:2605.0521063.8h-index: 8Has Code
AI Analysis

For disaster management practitioners, this system addresses the need for heterogeneous information access, but the gains are incremental over existing RAG approaches.

DisastRAG integrates structured, unstructured, and external disaster information via a multi-path retrieval-augmented LLM system, achieving 12-23 percentage point gains on multiple-choice tasks and up to 10.5 percentage point gains on open-ended tasks over no-retrieval baselines.

Effective disaster management requires rapid access to information distributed across structured operational records, unstructured institutional documents, and dynamic external sources. However, most existing disaster information systems and retrieval-augmented generation frameworks remain organized around a single access pathway, limiting their ability to support heterogeneous, time-sensitive, and context-dependent information needs. This study presents DisastRAG, a disaster-aware information integration and access system that combines large language models with retrieval-augmented access to structured, unstructured, and contextual disaster information. The framework is built around a multi-path architecture that supports document retrieval over a curated hazard corpus, structured access over relational disaster records, and external web fallback for out-of-corpus requests, while also incorporating query understanding, strategy routing, response generation, and contextual memory within a unified system. We evaluated the document retrieval performance using four open-source large language models across multiple retrieval configurations on multiple-choice and open-ended disaster information tasks. Retrieval augmentation consistently improves performance over no-retrieval baselines, yielding multiple-choice gains of 12-23 percentage points and open-ended keypoint coverage gains of up to 10.5 percentage points. Results show that larger candidate pools are most helpful for weaker models, while stronger models are more sensitive to retrieval noise. Hybrid retrieval performs best for open-ended coverage, whereas vector retrieval and shallower reranking more often favor closed-form factual selection. Case studies further show that structured access and web fallback extend the framework beyond document-only RAG.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes