LG | ME | ML | Jun 3, 2025

Probabilistic Factorial Experimental Design for Combinatorial Interventions

arXiv:2506.03363v1 | h-index: 4 | ICML
Originality: Highly original
AI Analysis

The paper addresses the scalability of experimental design in fields such as biomedicine and engineering. It offers a method to reduce the exponential number of interventions needed, though its contribution is incremental in that it optimizes existing frameworks.

The paper tackles the problem of efficiently designing combinatorial interventions over many treatments by introducing probabilistic factorial experimental design. It proves that a dosage of 1/2 per treatment is near-optimal, and its results imply that O(kp^{3k} ln(p)) observations are required to estimate a k-way interaction model.

A combinatorial intervention, consisting of multiple treatments applied to a single unit with potentially interactive effects, has substantial applications in fields such as biomedicine, engineering, and beyond. Given $p$ possible treatments, conducting all possible $2^p$ combinatorial interventions can be laborious and quickly becomes infeasible as $p$ increases. Here we introduce probabilistic factorial experimental design, formalized from how scientists perform lab experiments. In this framework, the experimenter selects a dosage for each possible treatment and applies it to a group of units. Each unit independently receives a random combination of treatments, sampled from a product Bernoulli distribution determined by the dosages. Additionally, the experimenter can carry out such experiments over multiple rounds, adapting the design in an active manner. We address the optimal experimental design problem within an intervention model that imposes bounded-degree interactions between treatments. In the passive setting, we provide a closed-form solution for the near-optimal design. Our results prove that a dosage of $\tfrac{1}{2}$ for each treatment is optimal up to a factor of $1+O(\tfrac{\ln(n)}{n})$ for estimating any $k$-way interaction model, regardless of $k$, and imply that $O\big(kp^{3k}\ln(p)\big)$ observations are required to accurately estimate this model. For the multi-round setting, we provide a near-optimal acquisition function that can be numerically optimized. We also explore several extensions of the design problem and finally validate our findings through simulations.
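The sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sample_design` and `interaction_subsets`, and the values of `p`, `k`, and `n_units`, are assumptions chosen for illustration. The sketch draws each unit's treatment combination from a product Bernoulli distribution with dosage 1/2, and counts the coefficients of an interaction model that keeps only subsets of at most `k` treatments.

```python
import itertools
import random


def sample_design(dosages, n_units, rng):
    """Draw one treatment combination per unit from a product Bernoulli.

    Unit j receives treatment i independently with probability dosages[i].
    Returns a list of n_units rows, each a list of 0/1 indicators.
    """
    return [[1 if rng.random() < d else 0 for d in dosages]
            for _ in range(n_units)]


def interaction_subsets(p, k):
    """List all treatment subsets of size at most k, including the empty set.

    Under a bounded-degree (k-way) interaction model, each subset
    corresponds to one coefficient to estimate.
    """
    return [s for size in range(k + 1)
            for s in itertools.combinations(range(p), size)]


# Hypothetical problem size, chosen only for illustration.
p, k, n_units = 10, 2, 200
rng = random.Random(0)

# Passive design with dosage 1/2 for every treatment.
design = sample_design([0.5] * p, n_units, rng)

n_coeffs = len(interaction_subsets(p, k))
n_distinct = len({tuple(row) for row in design})

print(n_coeffs, "coefficients in the k-way model")
print(2 ** p, "combinations in the full factorial design")
print(n_distinct, "distinct combinations observed among", n_units, "units")
```

With `p = 10` and `k = 2`, the bounded-degree model has 1 + 10 + 45 = 56 coefficients, against 2^10 = 1024 combinations in the full factorial design. That gap is what makes a randomized design over far fewer units plausible.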

