LG | ML | Mar 7, 2025

Performance Comparisons of Reinforcement Learning Algorithms for Sequential Experimental Design

arXiv:2503.05905v2 | 2 citations | h-index: 9
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of algorithm selection for researchers in experimental design, but it is incremental as it builds on existing reinforcement learning applications without introducing new methods.

The study compared reinforcement learning algorithms for sequential experimental design to identify which produce agents that make maximally informative decisions, finding that algorithms using dropout or ensemble approaches show better generalization properties.

Recent developments in sequential experimental design aim to construct a policy that efficiently navigates the design space in a way that maximises the expected information gain. Whilst there is work on achieving tractable policies for experimental design problems, there is significantly less work on obtaining policies that generalise well, i.e. that give good performance despite a change in the underlying statistical properties of the experiments. Conducting experiments sequentially has recently motivated the use of reinforcement learning, where an agent is trained to navigate the design space and select the most informative designs for experimentation. However, there is still a lack of understanding about the benefits and drawbacks of using particular reinforcement learning algorithms to train these agents. In our work, we investigate several reinforcement learning algorithms and their efficacy in producing agents that take maximally informative design decisions in sequential experimental design scenarios. We find that agent performance depends on the algorithm used for training, and that particular algorithms, using dropout or ensemble approaches, empirically exhibit attractive generalisation properties.
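The objective named in the abstract, expected information gain (EIG), can be made concrete with a small sketch. The example below is not taken from the paper: it assumes a hypothetical toy model (standard normal prior on a scalar parameter, observation y = design * theta + Gaussian noise) and estimates the EIG of a single design with a nested Monte Carlo estimator, which is one common choice among several. The function names and sample sizes are illustrative assumptions. For this toy model the EIG has a closed form, which is included only as a sanity check.

```python
import math
import random


def log_likelihood(y, theta, design, noise_std):
    """Log density of y under the assumed toy model y ~ N(design * theta, noise_std^2)."""
    z = (y - design * theta) / noise_std
    return -0.5 * math.log(2.0 * math.pi * noise_std ** 2) - 0.5 * z * z


def nested_mc_eig(design, n_outer=1000, n_inner=500, noise_std=1.0, seed=0):
    """Nested Monte Carlo estimate of the EIG of one design (toy model, assumed prior N(0, 1)).

    EIG(d) = E[ log p(y | theta, d) - log p(y | d) ], where the marginal p(y | d)
    is approximated by averaging the likelihood over fresh prior samples.
    """
    rng = random.Random(seed)
    # Inner prior samples are drawn once and reused for every outer sample (a simplification).
    inner_thetas = [rng.gauss(0.0, 1.0) for _ in range(n_inner)]
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0.0, 1.0)                      # draw a parameter from the prior
        y = design * theta + rng.gauss(0.0, noise_std)   # simulate an outcome for this design
        log_cond = log_likelihood(y, theta, design, noise_std)
        log_terms = [log_likelihood(y, t, design, noise_std) for t in inner_thetas]
        peak = max(log_terms)                            # log-mean-exp for numerical stability
        log_marginal = peak + math.log(
            sum(math.exp(v - peak) for v in log_terms) / n_inner
        )
        total += log_cond - log_marginal
    return total / n_outer


def exact_eig(design, noise_std=1.0):
    """Closed-form EIG for the linear-Gaussian toy model, used only as a check."""
    return 0.5 * math.log1p((design / noise_std) ** 2)


if __name__ == "__main__":
    for d in (0.0, 0.5, 2.0):
        print(d, nested_mc_eig(d), exact_eig(d))
```

In this toy setting a larger design magnitude yields a more informative experiment. A sequential policy of the kind the abstract describes would choose each design conditioned on the outcomes observed so far, rather than scoring one design in isolation.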

Code Implementations: 1 repo
