Bayesian Optimal Active Search and Surveying
This work addresses active learning problems in which generalization error is secondary. It offers theoretical and practical improvements for applications such as data mining and survey design, though its active search contributions are incremental extensions of prior work.
The paper tackles two active binary-classification problems with atypical objectives: active search, which aims to uncover as many members of a class as possible, and active surveying, which aims to predict the proportion of a class. It derives the optimal policies via Bayesian decision theory. It proves that less-myopic approximations to the optimal policy can outperform more-myopic ones by an arbitrary margin, and it derives bounds that, for certain models, reduce the exponential search space a naive implementation of the optimal policy would require.
We consider two active binary-classification problems with atypical objectives. In the first, active search, our goal is to actively uncover as many members of a given class as possible. In the second, active surveying, our goal is to actively query points to ultimately predict the proportion of a given class. Numerous real-world problems can be framed in these terms, and in either case typical model-based concerns such as generalization error are only of secondary importance. We approach these problems via Bayesian decision theory; after choosing natural utility functions, we derive the optimal policies. We provide three contributions. In addition to introducing the active surveying problem, we extend previous work on active search in two ways. First, we prove a novel theoretical result, that less-myopic approximations to the optimal policy can outperform more-myopic approximations by any arbitrary degree. We then derive bounds that for certain models allow us to reduce (in practice dramatically) the exponential search space required by a naive implementation of the optimal policy, enabling further lookahead while still ensuring that optimal decisions are always made.
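The difference between a more-myopic and a less-myopic approximation can be illustrated with a minimal sketch. It assumes the active search utility is the number of class members found, which is one natural reading of the goal stated above. The probability model (a weighted-neighbor estimate with a pseudo-count prior) and all names (prob_positive, one_step_score, two_step_score, choose, W) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Callable, Dict, List

Weights = List[List[float]]
Labels = Dict[int, int]  # index of a queried point -> observed label (0 or 1)


def prob_positive(i: int, labels: Labels, weights: Weights,
                  prior: float = 0.1, strength: float = 1.0) -> float:
    """Toy estimate of P(y_i = 1 | labels): a weighted average of the
    labels of observed neighbors, shrunk toward `prior` (assumed model)."""
    num = strength * prior
    den = strength
    for j, y in labels.items():
        num += weights[i][j] * y
        den += weights[i][j]
    return num / den


def one_step_score(i: int, labels: Labels, weights: Weights) -> float:
    """Expected number of positives found if point i is the last query."""
    return prob_positive(i, labels, weights)


def two_step_score(i: int, labels: Labels, weights: Weights) -> float:
    """Expected number of positives found over two queries: point i now,
    then the best remaining point given the (still unknown) label of i."""
    p = prob_positive(i, labels, weights)
    rest = [j for j in range(len(weights)) if j not in labels and j != i]
    if not rest:
        return p
    best_after = {}
    for y in (0, 1):
        hypothetical = {**labels, i: y}  # pretend we observed y at point i
        best_after[y] = max(prob_positive(j, hypothetical, weights) for j in rest)
    return p + p * best_after[1] + (1.0 - p) * best_after[0]


def choose(score: Callable[[int, Labels, Weights], float],
           labels: Labels, weights: Weights) -> int:
    """Return the unlabeled point with the highest score."""
    candidates = [i for i in range(len(weights)) if i not in labels]
    return max(candidates, key=lambda i: score(i, labels, weights))


# Three points: point 0 is isolated, points 1 and 2 are similar to each other.
W = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
```

With no labels observed, the one-step score is 0.1 for every point in W, so the one-step rule cannot tell them apart. The two-step score is 0.2 for the isolated point and 0.245 for either connected point, because a positive label there would raise the neighbor's probability to 0.55. This sketch enumerates every candidate and every outcome, which is the exponential cost in the lookahead depth that the bounds described above aim to reduce.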