Biomedical Hypothesis Explainability with Graph-Based Context Retrieval
This work addresses the need for explainable AI in biomedical hypothesis generation, though it appears incremental because it builds on existing retrieval-augmented generation frameworks.
The paper tackles the problem of explaining biomedical hypotheses. It introduces a method that combines graph-based retrieval with retrieval-augmented generation to supply contextual evidence from the scientific literature, and it demonstrates performance through expert and automated evaluations.
We introduce an explainability method for biomedical hypothesis generation systems, built on top of the novel Hypothesis Generation Context Retriever (HGCR) framework. Our approach combines semantic graph-based retrieval with relevant data-restrictive training to simulate real-world discovery constraints. Integrated with large language models (LLMs) via retrieval-augmented generation, the system explains hypotheses with contextual evidence drawn from published scientific literature. We also propose a novel feedback-loop approach that iteratively identifies and corrects flawed parts of LLM-generated explanations, refining both the evidence paths and the supporting context. We demonstrate the performance of our method with multiple LLMs and evaluate explanation and context retrieval quality through both expert-curated assessment and large-scale automated analysis. Our code is available at: https://github.com/IlyaTyagin/HGCR.
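
To make the feedback loop described above concrete, the following is a minimal sketch under stated assumptions, not the implementation in the HGCR repository. It assumes the semantic graph is a plain undirected graph of concept names, uses breadth-first search as a stand-in for the paper's graph-based retrieval, and replaces the LLM with two caller-supplied functions. All names (`build_graph`, `shortest_path`, `explain_with_feedback`, `generate`, `critique`) are hypothetical and chosen for illustration only.

```python
from collections import deque
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set, Tuple

Graph = Dict[str, Set[str]]
Edge = Tuple[str, str]


def build_graph(edges: Iterable[Edge]) -> Graph:
    """Build an undirected adjacency map from (concept, concept) pairs."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph: Graph, source: str, target: str,
                  banned: Set[FrozenSet[str]]) -> Optional[List[str]]:
    """Breadth-first search that skips edges already flagged as flawed."""
    if source not in graph or target not in graph:
        return None
    parents: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        # Sorted iteration keeps the result deterministic across runs.
        for nxt in sorted(graph[node]):
            if nxt not in parents and frozenset((node, nxt)) not in banned:
                parents[nxt] = node
                queue.append(nxt)
    return None


def explain_with_feedback(
    graph: Graph,
    contexts: Dict[FrozenSet[str], str],
    source: str,
    target: str,
    generate: Callable[[List[str], List[str]], str],
    critique: Callable[[List[str], str], Optional[Edge]],
    max_rounds: int = 3,
) -> Optional[Tuple[List[str], str]]:
    """Retrieve a path, explain it, and re-retrieve if the critic flags an edge."""
    banned: Set[FrozenSet[str]] = set()
    for _ in range(max_rounds):
        path = shortest_path(graph, source, target, banned)
        if path is None:
            return None  # no remaining evidence path connects the two concepts
        # Collect the literature snippet attached to each edge on the path.
        evidence = [contexts.get(frozenset(pair), "") for pair in zip(path, path[1:])]
        explanation = generate(path, evidence)   # stand-in for the LLM call
        flawed = critique(path, explanation)     # stand-in for the flaw detector
        if flawed is None:
            return path, explanation
        banned.add(frozenset(flawed))            # exclude the edge and retry
    return None


# Toy usage with made-up concepts and stubbed "LLM" functions.
toy_graph = build_graph([
    ("drug_x", "gene_a"), ("gene_a", "disease_y"),
    ("drug_x", "protein_b"), ("protein_b", "pathway_c"), ("pathway_c", "disease_y"),
])
toy_result = explain_with_feedback(
    toy_graph,
    contexts={},
    source="drug_x",
    target="disease_y",
    generate=lambda path, evidence: " -> ".join(path),
    critique=lambda path, text: ("gene_a", "disease_y") if "gene_a" in path else None,
)
```

In this toy run the stub critic rejects the link between `gene_a` and `disease_y`, so the loop discards the two-hop path and falls back to the longer path through `protein_b` and `pathway_c`. A real system would presumably rank candidate paths by semantic relevance and restrict the graph to literature available before the hypothesis date; both are omitted here.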