LG ML | Oct 21, 2022

Targeted active learning for probabilistic models

Christopher Tosh, Mauricio Tec, Wesley Tansey

arXiv:2210.12122v14.62 citations | h-index: 16 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficient experiment design for scientists, particularly in fields such as drug discovery. Its method targets a specific scientific goal rather than general information gain. The advance appears incremental because it builds on existing active learning and probabilistic modeling frameworks.

The authors address the problem of designing experiments to maximize scientific utility. They introduce PDBAL, a targeted active learning method that adaptively selects experiments using a user-specified risk function and a probabilistic model. In simulations, PDBAL converges faster than standard untargeted approaches and outperforms them. In a cancer drug screen study, it recovers the most efficacious drugs using a small fraction of the experiments.

A fundamental task in science is to design experiments that yield valuable insights about the system under study. Mathematically, these insights can be represented as a utility or risk function that shapes the value of conducting each experiment. We present PDBAL, a targeted active learning method that adaptively designs experiments to maximize scientific utility. PDBAL takes a user-specified risk function and combines it with a probabilistic model of the experimental outcomes to choose designs that rapidly converge on a high-utility model. We prove theoretical bounds on the label complexity of PDBAL and provide fast closed-form solutions for designing experiments with common exponential family likelihoods. In simulation studies, PDBAL consistently outperforms standard untargeted approaches that focus on maximizing expected information gain over the design space. Finally, we demonstrate the scientific potential of PDBAL through a study on a large cancer drug screen dataset where PDBAL quickly recovers the most efficacious drugs with a small fraction of the total number of experiments.
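To make the idea of targeted design selection concrete, here is a minimal Monte Carlo sketch in Python. It is not taken from the paper and should not be read as the PDBAL algorithm. It assumes a finite set of posterior samples, a Gaussian likelihood with known noise, and a 0/1 disagreement on a user-chosen target (here, which design has the highest mean) as a stand-in for the risk function. All names (`pairwise_target_distance`, `select_design`, `noise_sd`, `n_sim`) are hypothetical.

```python
import numpy as np

def pairwise_target_distance(samples, target):
    """d[i, j] = 1.0 if sampled models i and j disagree on the target, else 0.0."""
    t = np.array([target(s) for s in samples])
    return (t[:, None] != t[None, :]).astype(float)

def select_design(samples, dist, noise_sd, rng, n_sim=64):
    """Pick the design with the smallest expected post-observation disagreement.

    samples: (n_models, n_designs) array; samples[i, x] is the mean outcome
             of design x under sampled model i (an assumed setup).
    dist:    (n_models, n_models) target-specific disagreement matrix.
    """
    n_models, n_designs = samples.shape
    scores = np.empty(n_designs)
    for x in range(n_designs):
        total = 0.0
        for _ in range(n_sim):
            # Simulate an outcome from a uniformly chosen sampled model.
            k = rng.integers(n_models)
            y = samples[k, x] + noise_sd * rng.standard_normal()
            # Reweight every sampled model by its Gaussian likelihood of y.
            log_w = -0.5 * ((y - samples[:, x]) / noise_sd) ** 2
            w = np.exp(log_w - log_w.max())
            w /= w.sum()
            # Weighted average pairwise disagreement on the target.
            total += w @ dist @ w
        scores[x] = total / n_sim
    return int(np.argmin(scores)), scores

# Toy example: two sampled models that differ only at design 1.
rng = np.random.default_rng(0)
samples = np.array([[1.0, 5.0, 0.0],
                    [1.0, -5.0, 0.0]])
dist = pairwise_target_distance(samples, target=np.argmax)
best, scores = select_design(samples, dist, noise_sd=1.0, rng=rng)
```

In this toy setup only design 1 distinguishes the two sampled models, so observing it is the only way to resolve which design is best. An untargeted criterion would score designs by how much they reveal about the whole model, whereas this score counts only disagreement on the chosen target.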

