HCAIMar 16

Bridging the Experimental Last Mile: Digitizing Laboratory Know-How for Safe AI-Assisted Support

arXiv:2604.1634529.4h-index: 24
Predicted impact top 63% in HC · last 90 daysOriginality Incremental advance
AI Analysis

This addresses the need for safer and more reliable AI support in educational and exploratory laboratory settings, though it is incremental as it builds on existing multimodal AI and RAG techniques.

The study tackled the problem of digitizing laboratory know-how for safe AI-assisted support in human-led experiments by developing a human-in-the-loop AI assistant that extracts site-specific knowledge from video data and provides grounded responses, with evaluation showing utility scores of 3.25/4.00 and safety scores of 4.00/4.00.

Advances in Materials Informatics have accelerated the development of Self-Driving Laboratories (SDLs), yet human-led experiments remain standard in many educational and exploratory research settings. In such environments, practical know-how, including operational details and site-specific rules, is essential for safe and reliable laboratory work. In this proof-of-concept study, we developed a human-in-the-loop AI assistant that combines first-person experimental video, multimodal AI, and retrieval-augmented generation (RAG). Using powder X-ray diffraction experiments and student-recorded video data as inputs, the system extracts site-specific laboratory knowledge from recorded procedures, including physical techniques and audible confirmation that conventional manuals could omit. It then provides grounded responses based on the resulting manual. To reduce the risk of unsupported outputs, the system employs a two-layer safety design: source restriction through RAG and strict system-prompt constraints. Instructor-based evaluation showed alignment with expected guidance for questions covered by the manual. For out-of-scope queries, the system appropriately refused to answer, indicating a reduced risk of hallucination. Expert evaluation further indicated that the generated advisory reports were useful and safe (utility: 3.25/4.00; safety: 4.00/4.00). These results suggest a framework in which AI supports laboratory practice under explicit human supervision rather than replacing human judgment.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes