Adaptive Sampling for Estimating Distributions: A Bayesian Upper Confidence Bound Approach
This work offers an improved adaptive sampling strategy for researchers and public health officials needing to estimate probability distributions, such as disease prevalence, more efficiently and accurately.
This paper addresses the problem of adaptively sampling to estimate probability mass functions (pmfs) uniformly well, with the goal of minimizing worst-case mean squared error. The authors propose a Bayesian variant of existing upper confidence bound (UCB) approaches and show analytically that its performance is no worse than theirs. The method is used to design adaptive sampling protocols for estimating SARS-CoV-2 seroprevalence across groups, and its effectiveness is discussed using data from a seroprevalence survey in Los Angeles County.
The problem of adaptive sampling for estimating probability mass functions (pmfs) uniformly well is considered. Performance of the sampling strategy is measured in terms of the worst-case mean squared error. A Bayesian variant of the existing upper confidence bound (UCB) based approaches is proposed. It is shown analytically that the performance of this Bayesian variant is no worse than that of the existing approaches. The posterior distribution on the pmfs in the Bayesian setting allows for a tighter computation of upper confidence bounds, which leads to significant performance gains in practice. Using this approach, adaptive sampling protocols are proposed for estimating SARS-CoV-2 seroprevalence in groups defined by attributes such as location and ethnicity. The effectiveness of this strategy is discussed using data obtained from a seroprevalence survey in Los Angeles County.
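To make the idea concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes binary outcomes (seropositive or not), an independent uniform Beta(1, 1) prior for each group, and a Monte Carlo estimate of the posterior upper quantile. For a binary outcome with prevalence p, the variance of the empirical frequency after n samples is p(1 - p)/n, so the sketch samples next from the group whose upper bound on that quantity is largest. The function names, the 0.95 quantile, and the number of posterior draws are all hypothetical choices made for illustration.

```python
import random


def variance_ucb(successes, failures, quantile=0.95, draws=500, rng=random):
    """Monte Carlo upper quantile of p*(1-p) under a Beta(1+s, 1+f) posterior.

    Assumption: a uniform Beta(1, 1) prior on the prevalence p.
    """
    samples = sorted(
        p * (1.0 - p)
        for p in (
            rng.betavariate(1 + successes, 1 + failures) for _ in range(draws)
        )
    )
    return samples[min(draws - 1, int(quantile * draws))]


def choose_group(counts, quantile=0.95, draws=500, rng=random):
    """Return the index of the group to sample next.

    counts is a list of (successes, failures) pairs, one per group.
    The index for a group is its upper bound on p*(1-p) divided by the
    number of samples taken so far; unsampled groups come first.
    """
    scores = []
    for s, f in counts:
        n = s + f
        if n == 0:
            scores.append(float("inf"))
        else:
            scores.append(variance_ucb(s, f, quantile, draws, rng) / n)
    return max(range(len(counts)), key=scores.__getitem__)


def simulate(true_rates, budget, draws=500, seed=0):
    """Spend `budget` samples adaptively across groups with known true rates."""
    rng = random.Random(seed)
    counts = [(0, 0) for _ in true_rates]
    for _ in range(budget):
        g = choose_group(counts, draws=draws, rng=rng)
        s, f = counts[g]
        if rng.random() < true_rates[g]:
            counts[g] = (s + 1, f)
        else:
            counts[g] = (s, f + 1)
    return counts
```

For instance, `simulate([0.5, 0.02], budget=200)` tends to allocate most of the budget to the first group, whose outcomes are more variable and therefore harder to estimate to the same accuracy.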