Learning Principle of Least Action with Reinforcement Learning
This work provides an incremental approach for physicists and AI researchers to model physical phenomena using reinforcement learning, specifically by connecting it to the principle of least action.
This paper explores using reinforcement learning to discover physical trajectories by setting rewards based on the action integral. The authors demonstrate the idea by training a Q-learning agent to find the minimal-time path of light through materials with different refractive indices, recovering results consistent with Snell's law and Fermat's principle.
Nature offers a route to understanding physics through reinforcement learning, since it favors the most economical way for an object to propagate. In classical mechanics, an object follows the path that makes the integral of the Lagrangian, called the action $\mathcal{S}$, stationary (in the simplest cases, minimal). We propose setting the reward/penalty as a function of $\mathcal{S}$, so that an agent can learn the physical trajectories of particles in various kinds of environments with reinforcement learning. In this work, we verify the idea by using a Q-learning-based algorithm to learn how light propagates in materials with different refractive indices, and show that the agent recovers the minimal-time path equivalent to the solution obtained from Snell's law or Fermat's principle. We also discuss the similarity of our reinforcement learning approach to the path integral formalism.
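To make the idea concrete, the following is a minimal sketch of tabular Q-learning on a toy refraction problem. It is not the authors' implementation. Everything specific here is an assumption chosen for illustration: the four unit-thick layers with indices 1.0, 1.0, 1.5, 1.5, the seven lateral grid columns, the source and target columns, and the names `step_time`, `actions`, `q_learning`, `greedy_path`, and `travel_time`. The reward for crossing a layer is the negative optical path length $n \, ds$ (travel time with $c = 1$), so maximizing the return amounts to minimizing the total travel time, in the spirit of Fermat's principle.

```python
import math
import random

# Toy setup (all values are illustrative assumptions, not from the paper).
N_COLS = 7                      # lateral grid positions 0..6
INDICES = [1.0, 1.0, 1.5, 1.5]  # refractive index of each unit-thick layer
SOURCE, TARGET = 0, 6           # entry column (top) and exit column (bottom)
N_ROWS = len(INDICES)


def step_time(row, col, next_col):
    """Optical path length n * ds for crossing one layer (time with c = 1)."""
    return INDICES[row] * math.hypot(1.0, next_col - col)


def actions(row):
    """An action is the column reached on the next interface.
    The final step is forced to end on the target column."""
    return [TARGET] if row == N_ROWS - 1 else list(range(N_COLS))


def q_learning(episodes=20000, alpha=1.0, gamma=1.0, epsilon=1.0, seed=0):
    """Tabular Q-learning with reward = -travel time per layer.
    The environment is deterministic, so alpha = 1 is safe here.
    Q-learning is off-policy, so a fully random behavior policy
    (epsilon = 1) still learns the values of the greedy policy."""
    rng = random.Random(seed)
    q = {}  # q[(row, col, next_col)]; unseen entries are treated as 0
    for _ in range(episodes):
        col = SOURCE
        for row in range(N_ROWS):
            acts = actions(row)
            if rng.random() < epsilon:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda x: q.get((row, col, x), 0.0))
            reward = -step_time(row, col, a)
            if row + 1 < N_ROWS:
                future = max(q.get((row + 1, a, b), 0.0)
                             for b in actions(row + 1))
            else:
                future = 0.0  # terminal: the ray has reached the target
            old = q.get((row, col, a), 0.0)
            q[(row, col, a)] = old + alpha * (reward + gamma * future - old)
            col = a
    return q


def greedy_path(q):
    """Follow the highest-valued action from the source down to the target."""
    path = [SOURCE]
    for row in range(N_ROWS):
        col = path[-1]
        path.append(max(actions(row), key=lambda a: q.get((row, col, a), 0.0)))
    return path


def travel_time(path):
    """Total travel time of a path given as one column per interface."""
    return sum(step_time(r, path[r], path[r + 1]) for r in range(N_ROWS))


q = q_learning()
path = greedy_path(q)
print(path, travel_time(path))
```

With these settings the greedy path should visit columns 0, 2, 4, 5, 6: the ray shifts four columns in the two layers with $n = 1.0$ and only two columns in the two layers with $n = 1.5$, which is the qualitative signature of refraction toward the normal in the denser medium. On such a coarse grid the agreement with Snell's law is only approximate; a finer lateral grid would be needed for a quantitative comparison.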