ML LG OC

Jan 19, 2012

Adaptive Policies for Sequential Sampling under Incomplete Information and a Cost Constraint

arXiv:1201.4002v1

6 citations

Originality: Incremental advance

AI Analysis

This paper addresses a decision-making problem under uncertainty, with applications such as resource allocation and clinical trials, but it appears incremental because it builds on existing adaptive policy frameworks.

The paper tackles sequential sampling from populations with unknown outcome distributions, aiming to maximize the long-run average outcome under a constraint on average sampling cost. It shows that the proposed adaptive policies converge with probability 1 to the optimal value attainable under complete information, for all distributions with finite means.

We consider the problem of sequential sampling from a finite number of independent statistical populations to maximize the expected infinite horizon average outcome per period, under a constraint that the expected average sampling cost does not exceed an upper bound. The outcome distributions are not known. We construct a class of consistent adaptive policies, under which the average outcome converges with probability 1 to the true value under complete information for all distributions with finite means. We also compare the rate of convergence for various policies in this class using simulation.
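To make the complete-information benchmark in the abstract concrete, the sketch below treats it as a small linear program: choose sampling probabilities over the populations to maximize the expected outcome while keeping the expected cost within the bound. This formulation is an assumption made for illustration, not something stated in the abstract, and the names `best_randomized_policy`, `means`, `costs`, and `budget` are hypothetical. It relies on the standard fact that a linear program with one budget constraint over the probability simplex has an optimal solution supported on at most two populations, so it simply enumerates those candidates.

```python
from itertools import combinations


def best_randomized_policy(means, costs, budget):
    """Maximize sum(p[i] * means[i]) subject to sum(p[i] * costs[i]) <= budget,
    with p a probability vector. Returns (value, probs), or (None, None)
    if no population mix satisfies the budget.

    Assumption: the true means are known (the complete-information case).
    """
    n = len(means)
    best_value, best_probs = None, None

    def consider(value, probs):
        nonlocal best_value, best_probs
        if best_value is None or value > best_value:
            best_value, best_probs = value, probs

    # Candidate type 1: always sample one population that fits the budget.
    for i in range(n):
        if costs[i] <= budget:
            probs = [0.0] * n
            probs[i] = 1.0
            consider(means[i], probs)

    # Candidate type 2: mix a cheap and an expensive population so that
    # the expected cost equals the budget exactly.
    for i, j in combinations(range(n), 2):
        lo, hi = (i, j) if costs[i] <= costs[j] else (j, i)
        if costs[lo] <= budget < costs[hi]:
            p_hi = (budget - costs[lo]) / (costs[hi] - costs[lo])
            probs = [0.0] * n
            probs[lo] = 1.0 - p_hi
            probs[hi] = p_hi
            consider(probs[lo] * means[lo] + probs[hi] * means[hi], probs)

    return best_value, best_probs


# Two populations: a cheap one with a low mean, a costly one with a high mean.
value, probs = best_randomized_policy(
    means=[1.0, 3.0], costs=[1.0, 4.0], budget=2.0
)
```

With these illustrative inputs the budget rules out always sampling the costly population, so the best mix samples it one third of the time, for an average outcome of 5/3. An adaptive policy of the kind the abstract describes would not know `means`; one plausible reading is that it works from estimates built from the observed samples instead, and consistency then means its average outcome converges to this benchmark value.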
