Reinforcement Learning for Molecular Dynamics Optimization: A Stochastic Pontryagin Maximum Principle Approach
This work addresses molecular optimization for drug discovery and molecular design. Its contribution is incremental, since it builds on existing reinforcement learning and physics-based methods.
The paper optimizes molecular dynamics over entire trajectories rather than over final configurations alone. It combines a reinforcement learning framework with a stochastic Pontryagin Maximum Principle and Soft Actor-Critic, and it reports competitive performance on six molecules, including Bradykinin and Oxytocin, against unsupervised physics-based methods.
In this paper, we present a novel reinforcement learning framework that optimizes molecular dynamics over the entire trajectory rather than just the final molecular configuration. Leveraging a stochastic version of Pontryagin's Maximum Principle (PMP) and the Soft Actor-Critic (SAC) algorithm, our framework explores non-convex molecular energy landscapes, escaping local minima to stabilize in low-energy states. Our approach operates in continuous state and action spaces and does not rely on labeled data, making it applicable to a wide range of molecular systems. Through extensive experiments on six distinct molecules, including Bradykinin and Oxytocin, we demonstrate competitive performance against other unsupervised physics-based methods, such as the Greedy and NEMO-based algorithms. The method's adaptability and its focus on dynamic trajectory optimization make it suitable for applications such as drug discovery and molecular design.
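To make the distinction between a trajectory-level objective and a final-configuration objective concrete, the following is a minimal sketch under stated assumptions. The one-dimensional double-well function `energy`, the helper names `terminal_cost` and `trajectory_cost`, and the two hand-written paths are all hypothetical illustrations; they are not the molecular energy, the reward, or the algorithm used in this work.

```python
import numpy as np


def energy(x):
    """Hypothetical non-convex 1-D energy with two wells (toy stand-in only)."""
    return (x**2 - 1.0) ** 2 + 0.3 * x


def terminal_cost(traj):
    """Cost that looks only at the final configuration of a path."""
    return float(energy(traj[-1]))


def trajectory_cost(traj):
    """Cost that accumulates the energy of every configuration along a path."""
    return float(np.sum(energy(np.asarray(traj))))


# Two assumed paths that start at x = 1.0 and end at x = -1.0.
# `direct` crosses the barrier near x = 0.
direct = np.linspace(1.0, -1.0, 11)
# `detour` first climbs to the high-energy region near x = 2.0, then returns.
detour = np.concatenate(
    [np.linspace(1.0, 2.0, 6), np.linspace(2.0, -1.0, 6)[1:]]
)

for name, path in [("direct", direct), ("detour", detour)]:
    print(name, terminal_cost(path), trajectory_cost(path))
```

Both paths end in the same configuration, so a final-configuration objective cannot tell them apart, whereas the accumulated cost penalizes the detour through the high-energy region. This is only meant to illustrate why a trajectory-focused objective carries more information than a terminal one.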