IRAICL, Sep 16, 2025

Optimizing Agricultural Research: A RAG-Based Approach to Mycorrhizal Fungi Information

arXiv:2511.14765v1
h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for efficient knowledge discovery in sustainable agriculture and offers a tool to support decision-making by farmers and researchers. Its contribution is incremental: it applies an existing RAG method to a new domain.

The study addresses the difficulty of accessing and synthesizing information on arbuscular mycorrhizal fungi (AMF) for agricultural applications. It develops a RAG-based system that retrieves domain literature and generates contextually accurate responses, and its evaluation indicates that the system surfaces and synthesizes highly relevant information on AMF interactions with crops such as tomato.

Retrieval-Augmented Generation (RAG) represents a transformative approach within natural language processing (NLP), combining neural information retrieval with generative language modeling to enhance both contextual accuracy and factual reliability of responses. Unlike conventional Large Language Models (LLMs), which are constrained by static training corpora, RAG-powered systems dynamically integrate domain-specific external knowledge sources, thereby overcoming temporal and disciplinary limitations. In this study, we present the design and evaluation of a RAG-enabled system tailored for Mycophyto, with a focus on advancing agricultural applications related to arbuscular mycorrhizal fungi (AMF). These fungi play a critical role in sustainable agriculture by enhancing nutrient acquisition, improving plant resilience under abiotic and biotic stresses, and contributing to soil health. Our system operationalizes a dual-layered strategy: (i) semantic retrieval and augmentation of domain-specific content from agronomy and biotechnology corpora using vector embeddings, and (ii) structured data extraction to capture predefined experimental metadata such as inoculation methods, spore densities, soil parameters, and yield outcomes. This hybrid approach ensures that generated responses are not only semantically aligned but also supported by structured experimental evidence. To support scalability, embeddings are stored in a high-performance vector database, allowing near real-time retrieval from an evolving literature base. Empirical evaluation demonstrates that the proposed pipeline retrieves and synthesizes highly relevant information regarding AMF interactions with crop systems, such as tomato (Solanum lycopersicum). The framework underscores the potential of AI-driven knowledge discovery to accelerate agroecological innovation and enhance decision-making in sustainable farming systems.
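The dual-layered strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names neither the embedding model, the vector database, nor the extraction schema. Everything below is an assumption made for illustration: the three-sentence corpus is invented, a bag-of-words count vector stands in for a neural embedding, an in-memory list stands in for the vector database, and the regular expressions and field names (inoculation_method, spores_per_gram, soil_ph, yield_change_pct) are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus, invented for illustration (not data from the paper).
DOCS = [
    "Tomato plants inoculated with AMF by seed coating at 50 spores per gram "
    "showed 12% yield increase in soil at pH 6.5.",
    "Wheat rust resistance breeding relies on marker-assisted selection.",
    "Root dipping inoculation of tomato with 20 spores per gram improved "
    "phosphorus uptake.",
]


def embed(text):
    """Stand-in embedding: term counts. A real system would use a neural encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Layer (i): rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def _number(pattern, text):
    """Return the first captured number as a float, or None if absent."""
    m = re.search(pattern, text)
    return float(m.group(1)) if m else None


def extract_metadata(text):
    """Layer (ii): pull predefined experimental fields out of a passage."""
    method = re.search(r"seed coating|root dipping", text, re.IGNORECASE)
    return {
        "inoculation_method": method.group(0).lower() if method else None,
        "spores_per_gram": _number(r"(\d+(?:\.\d+)?)\s+spores per gram", text),
        "soil_ph": _number(r"pH\s*(\d+(?:\.\d+)?)", text),
        "yield_change_pct": _number(r"(\d+(?:\.\d+)?)\s*%\s+yield", text),
    }


def build_prompt(query, docs, k=2):
    """Combine retrieved passages and their metadata into a generator prompt."""
    blocks = []
    for passage in retrieve(query, docs, k):
        blocks.append(f"Passage: {passage}\nMetadata: {extract_metadata(passage)}")
    return "Context:\n" + "\n\n".join(blocks) + f"\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("AMF inoculation tomato spores", DOCS))
```

The prompt returned by build_prompt would then be passed to a language model for the generation step, which is omitted here. Attaching the extracted fields to each passage is one way to keep the generated answer tied to structured experimental evidence, as the abstract describes.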
