ML IT LGFeb 24, 2017

Bayes-Optimal Entropy Pursuit for Active Choice-Based Preference Learning

Stephen N. Pallone, Peter I. Frazier, Shane G. Henderson

arXiv:1702.07694v11.81 citations

Originality Incremental advance

AI Analysis

This work addresses preference learning for individual users in active settings, but it is incremental as it builds on existing Bayesian and entropy-based methods.

The paper tackles the problem of actively learning a user's preferences through choice-based queries, showing that a greedy policy for reducing posterior entropy is Bayes-optimal and achieves a linear lower bound under certain noise conditions, with numerical comparisons to a knowledge gradient policy.

We analyze the problem of learning a single user's preferences in an active learning setting, sequentially and adaptively querying the user over a finite time horizon. Learning is conducted via choice-based queries, where the user selects her preferred option among a small subset of offered alternatives. These queries have been shown to be a robust and efficient way to learn an individual's preferences. We take a parametric approach and model the user's preferences through a linear classifier, using a Bayesian prior to encode our current knowledge of this classifier. The rate at which we learn depends on the alternatives offered at every time epoch. Under certain noise assumptions, we show that the Bayes-optimal policy for maximally reducing entropy of the posterior distribution of this linear classifier is a greedy policy, and that this policy achieves a linear lower bound when alternatives can be constructed from the continuum. Further, we analyze a different metric called misclassification error, proving that the performance of the optimal policy that minimizes misclassification error is bounded below by a linear function of differential entropy. Lastly, we numerically compare the greedy entropy reduction policy with a knowledge gradient policy under a number of scenarios, examining their performance under both differential entropy and misclassification error.

View on arXiv PDF

Similar