Experiments in Agentic AI for Science
For scientists and researchers, this work provides practical agentic AI systems that automate data curation and report generation, though the approach is incremental.
The paper introduces two agentic AI frameworks for scientific workflows: DeepTS/DeepCollector for automated time-series dataset curation and DeepScribe for converting physics lectures into structured reports, demonstrating practical systems engineering to overcome LLM limitations.
This paper details two novel frameworks for developing autonomous, agentic AI in scientific workflows. Both systems leverage a hybrid "Local Body, Remote Brain" architecture via Google Colab, utilizing Python-based local orchestrators to invoke large language model (LLM) cloud backends. The first agent, DeepTS/DeepCollector, automates the large-scale curation, extraction, and deduplication of time-series datasets. The second, DeepScribe, is an autonomous presentation analyzer that converts visually dense, mathematically complex physics lectures into structured scientific reports. Through practical systems engineering, including granular attribute extraction (Cellular RAG), remote data inspection, and distributed concurrency controls, we demonstrate how agentic AI can overcome the context and reasoning limitations of current state-of-the-art systems to rigorously support scientific workflows. Finally, we outline a generalization of DeepTS to support deep knowledge graphs and discuss the application of this conceptual approach to high-energy physics (DeepQCD).
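To make the "Local Body, Remote Brain" pattern concrete, the following is a minimal sketch, not the authors' implementation. It assumes a local Python orchestrator that deduplicates candidate dataset records and then queries a remote model under a concurrency cap. The names `remote_brain`, `record_key`, `deduplicate`, `orchestrate`, and `run`, the record fields `name` and `url`, and the canned reply are all hypothetical; a real system would replace `remote_brain` with an actual cloud LLM call.

```python
import asyncio
import hashlib


async def remote_brain(prompt: str) -> str:
    """Hypothetical stand-in for a cloud LLM call; returns a canned reply."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"summary of: {prompt}"


def record_key(record: dict) -> str:
    """Fingerprint a record by its normalized name and URL (assumed fields)."""
    basis = record["name"].strip().lower() + "|" + record["url"].strip().lower()
    return hashlib.sha256(basis.encode("utf-8")).hexdigest()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each fingerprint, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


async def orchestrate(records: list[dict], max_concurrency: int = 2) -> list[dict]:
    """Local 'body': dedupe, then query the remote 'brain' with bounded concurrency."""
    limit = asyncio.Semaphore(max_concurrency)

    async def describe(record: dict) -> dict:
        async with limit:  # at most max_concurrency calls in flight
            reply = await remote_brain(record["name"])
        return {**record, "description": reply}

    return await asyncio.gather(*(describe(r) for r in deduplicate(records)))


def run(records: list[dict], max_concurrency: int = 2) -> list[dict]:
    """Synchronous entry point for scripts; in a notebook, await orchestrate() instead."""
    return asyncio.run(orchestrate(records, max_concurrency))
```

The semaphore is one simple way to cap in-flight requests from a single process; the distributed concurrency controls mentioned in the abstract would need coordination across workers, which this sketch does not attempt.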