44.2 | AI | Apr 1
Reducing Hallucinations in LLM-based Scientific Literature Analysis Using Peer Context Outlier Detection
Daniel Xie, Maxwell J. Jacobson, Adil Wazeer et al.
Reducing hallucinations in Large Language Models (LLMs) is essential for improving the accuracy of data extraction from large text corpora. Current methods, such as prompt engineering and chain-of-thought prompting, focus on individual documents and do not consider relationships across a corpus. This paper introduces Peer Context Outlier Detection (P-COD), a novel approach that uses the relationships between documents to improve extraction accuracy. Our application domain is scientific literature summarization, where papers with similar experimental settings should draw similar conclusions. By comparing extracted data to validated peer information within the corpus, we adjust confidence scores and flag low-confidence results for expert review. High-confidence results, supported by peer validation, are considered reliable. Our experiments achieve up to 98% precision in outlier detection across six domains of science, demonstrating that our design reduces hallucinations, enhances trust in automated systems, and allows researchers to focus on ambiguous cases, streamlining data extraction workflows.
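The abstract does not spell out how peer comparison changes a confidence score, so the following is only a minimal sketch of the general idea, not the authors' P-COD method. It assumes each extracted result is a single number, that "peers" are already-validated values from papers with similar settings, and that an outlier is a value far from the peer median in robust (MAD-based) units. The function names, `tolerance`, `penalty`, and `review_threshold` are all hypothetical.

```python
from statistics import median


def peer_adjusted_confidence(value, base_conf, peer_values,
                             tolerance=3.0, penalty=0.5):
    """Return a confidence score adjusted by agreement with validated peers.

    Hypothetical rule: if the value lies more than `tolerance` robust
    standard deviations from the peer median, multiply the confidence by
    `penalty`; otherwise leave it unchanged.
    """
    if len(peer_values) < 3:
        return base_conf  # too few peers to judge; keep the original score
    center = median(peer_values)
    # Median absolute deviation, scaled to approximate a standard deviation.
    mad = median([abs(v - center) for v in peer_values])
    scale = 1.4826 * mad
    if scale == 0:
        scale = 1e-9  # all peers identical; avoid division by zero
    robust_z = abs(value - center) / scale
    return base_conf * penalty if robust_z > tolerance else base_conf


def needs_expert_review(value, base_conf, peer_values, review_threshold=0.6):
    """Flag a result whose peer-adjusted confidence falls below the threshold."""
    adjusted = peer_adjusted_confidence(value, base_conf, peer_values)
    return adjusted < review_threshold


# Illustrative, made-up numbers (not data from the paper).
peers = [2.1, 2.3, 2.0, 2.2, 2.4]
consistent = needs_expert_review(2.25, 0.9, peers)  # close to the peers
suspicious = needs_expert_review(9.0, 0.9, peers)   # far from the peers
```

With these made-up values, the extraction near the peer median keeps its confidence, while the distant one is penalized and routed to review. A real system would also need a notion of which papers count as peers, which this sketch takes as given.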
CV | Sep 13, 2023
End-to-end Phase Field Model Discovery Combining Experimentation, Crowdsourcing, Simulation and Learning
Md Nasim, Anter El-Azab, Xinghang Zhang et al.
The availability of terabyte-scale experimental data calls for AI-driven approaches that automatically discover scientific models from data. Nonetheless, significant challenges remain in AI-driven scientific discovery: (i) The annotation of large-scale datasets requires a fundamental rethinking of how scalable crowdsourcing tools are developed. (ii) Learning scientific models from data calls for innovations beyond black-box neural nets. (iii) Novel visualization and diagnosis tools are needed for collaboration among experimental physicists, theoretical physicists, and computer scientists. We present the Phase-Field-Lab platform for end-to-end phase field model discovery, which automatically discovers phase field physics models from experimental data by integrating experimentation, crowdsourcing, simulation, and learning. Phase-Field-Lab combines (i) a streamlined annotation tool that reduces annotation time by ~50-75% while increasing annotation accuracy compared to the baseline; (ii) an end-to-end neural model that automatically learns phase field models from data by embedding phase field simulation and existing domain knowledge into learning; and (iii) novel interfaces and visualizations that integrate our platform into the scientific discovery cycle of domain scientists. Our platform is deployed in the analysis of nano-structure evolution in materials under extreme conditions (high temperature and irradiation). Our approach reveals new properties of nano-void defects that cannot otherwise be detected via manual analysis.
27.7 | AI | Apr 1
A Multi-Agent Human-LLM Collaborative Framework for Closed-Loop Scientific Literature Summarization
Maxwell J. Jacobson, Daniel Xie, Jackson Shen et al.
Scientific discovery is slowed by fragmented literature that requires excessive human effort to gather, analyze, and understand. AI tools, including autonomous summarization and question answering, have been developed to aid in understanding scientific literature. However, these tools lack the structured, multi-step approach necessary for extracting deep insights from scientific literature. Large Language Models (LLMs) offer new possibilities for literature analysis, but remain unreliable due to hallucinations and incomplete extraction. We introduce Elhuyar, a multi-agent, human-in-the-loop system that integrates LLMs, structured AI, and human scientists to extract, analyze, and iteratively refine insights from scientific literature. The framework distributes tasks among specialized agents for filtering papers, extracting data, fitting models, and summarizing findings, with human oversight ensuring reliability. The system generates structured reports with extracted data, visualizations, model equations, and text summaries, enabling deeper inquiry through iterative refinement. Deployed in materials science, it analyzed literature on tungsten under helium-ion irradiation, showing experimentally correlated exponential helium bubble growth with irradiation dose and temperature, offering insight for plasma-facing materials (PFMs) in fusion reactors. This demonstrates how AI-assisted literature review can uncover scientific patterns and accelerate discovery.
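The abstract mentions fitting models to extracted data and reports an exponential trend, but gives no equation. As a rough illustration of what such a fitting step could look like, the sketch below fits y = a · exp(k · x) for a single predictor (for example, dose at fixed temperature) by least squares on log(y). The function name, the single-predictor form, and the sample values are assumptions; the values are synthetic and are not results from the paper.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(k * x) by ordinary least squares on log(y).

    Returns (a, k). Requires at least two points, strictly positive ys,
    and at least two distinct x values.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    if any(y <= 0 for y in ys):
        raise ValueError("log-linear fit needs strictly positive y values")
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # Slope and intercept of the straight line log(y) = log(a) + k * x.
    k = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs)) / sxx
    a = math.exp(mean_log - k * mean_x)
    return a, k


# Synthetic, noise-free points generated from a = 2.0, k = 0.5.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
sizes = [2.0 * math.exp(0.5 * d) for d in doses]
a_hat, k_hat = fit_exponential(doses, sizes)
```

On noise-free synthetic input the fit recovers the generating parameters up to floating-point error. With real extracted data, the log transform weights small values more heavily, so a nonlinear least-squares fit may be preferable.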