Towards Discovery of Polymers for Insulin Delivery via Physics-Grounded Agentic Workflows
For researchers in drug delivery and polymer design, this work provides a method to navigate large chemical spaces efficiently, though it is domain-specific and incremental in combining existing tools.
The paper introduces an agentic workflow combining LLMs with physics-based tools to discover polymers for insulin delivery, achieving an insulin-polymer interaction energy of -2263 kJ/mol, outperforming RL baselines by 68% and Bayesian optimization by 19%.
Cold-chain storage limits access to insulin for hundreds of millions of people; a thermally protective patch polymer could help, but the design space is too large for exhaustive experiment. We address this problem with an agentic workflow in which a large language model (LLM) calls physics-based tools through the Model Context Protocol (MCP) to search the discrete PSMILES space under a budget of OpenMM Packmol-matrix evaluations. The LLM acts as an implicit acquisition function conditioned on a persistent "discovery world": a record of hypotheses, literature claims, and simulation outcomes that is updated each iteration. Under matched oracle budgets, the best autonomous campaign reaches an insulin-polymer interaction energy of -2263 kJ/mol, outperforming reinforcement-learning baselines by 68% and Bayesian optimization by 19%. Three independent campaigns converge on one structural motif (dense hydrogen-bond donors and acceptors per repeat unit), while physics checks reject infeasible packings and name-structure mismatches before they can steer the next step. The science stage is CPU-bound and runs on commodity hardware. More broadly, the same architecture and workflow apply to other protein-stabilization tasks whenever a tractable screening oracle is available.
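The budgeted propose-check-evaluate loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `DiscoveryWorld`, `run_campaign`, and all `toy_*` functions are hypothetical names, the proposer is a fixed candidate pool standing in for the LLM, the feasibility check only counts the two `[*]` attachment points of a PSMILES string, and the oracle is a toy heteroatom count rather than an OpenMM simulation. The scores it produces have no physical meaning.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class DiscoveryWorld:
    """Persistent state carried across iterations (hypothetical structure)."""
    hypotheses: List[str] = field(default_factory=list)
    literature_claims: List[str] = field(default_factory=list)
    outcomes: List[Tuple[str, float]] = field(default_factory=list)  # (PSMILES, score)
    rejected: List[str] = field(default_factory=list)  # failed the feasibility check

    def best(self) -> Optional[Tuple[str, float]]:
        # Lower (more negative) score is treated as better.
        return min(self.outcomes, key=lambda o: o[1]) if self.outcomes else None


def run_campaign(
    propose: Callable[[DiscoveryWorld], Optional[str]],
    is_feasible: Callable[[str], bool],
    oracle: Callable[[str], float],
    budget: int,
) -> DiscoveryWorld:
    """Spend at most `budget` oracle calls; rejected candidates cost nothing."""
    world = DiscoveryWorld()
    seen: Set[str] = set()
    spent = 0
    proposals = 0
    # Cap proposals so a proposer that repeats itself cannot loop forever.
    while spent < budget and proposals < 10 * budget:
        proposals += 1
        candidate = propose(world)
        if candidate is None:
            break
        if candidate in seen:
            continue
        seen.add(candidate)
        if not is_feasible(candidate):
            # Filter before the oracle so bad candidates never enter the outcomes.
            world.rejected.append(candidate)
            continue
        world.outcomes.append((candidate, oracle(candidate)))
        spent += 1
    return world


# --- Toy stand-ins (assumptions for illustration only) ---
POOL = ["[*]CC[*]", "[*]CC(O)[*]", "CC(N)C", "[*]NC(=O)C(O)[*]", "[*]CC(N)C(=O)O[*]"]


def toy_propose(world: DiscoveryWorld) -> Optional[str]:
    # Stand-in for the LLM: return the first pool entry not yet tried or rejected.
    tried = {p for p, _ in world.outcomes} | set(world.rejected)
    remaining = [p for p in POOL if p not in tried]
    return remaining[0] if remaining else None


def toy_is_feasible(psmiles: str) -> bool:
    # A PSMILES repeat unit carries exactly two [*] attachment points.
    return psmiles.count("[*]") == 2


def toy_oracle(psmiles: str) -> float:
    # Toy score: more N and O characters gives a lower (better) value.
    return -float(sum(psmiles.count(atom) for atom in "NO"))


if __name__ == "__main__":
    world = run_campaign(toy_propose, toy_is_feasible, toy_oracle, budget=3)
    print(world.best(), world.rejected)
```

In this sketch the feasibility filter runs before the oracle, so a rejected candidate consumes no budget and never reaches the outcome record that conditions the next proposal. A real proposer would read the full world state, including hypotheses and literature claims, instead of only the lists of tried and rejected candidates.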