Incentivizing Exploration with Selective Data Disclosure
This work addresses sequential social learning in recommendation systems, offering a solution that relaxes the rationality and commitment assumptions that prior work places on agents.
The paper tackles the problem of incentivizing efficient exploration in recommendation systems by designing a system that presents agents with selected past data, achieving asymptotically optimal regret rates for exploration.
We design recommendation systems that incentivize efficient exploration. Agents arrive sequentially, choose actions, and receive rewards drawn from fixed but unknown action-specific distributions. The recommendation system presents each agent with the actions and rewards of a subsequence of past agents, chosen ex ante. Thus, the agents engage in sequential social learning, moderated by these subsequences. We attain the optimal regret rate for exploration asymptotically, using a flexible frequentist behavioral model and mitigating the rationality and commitment assumptions inherent in prior work. We suggest three components of effective recommendation systems: independent focus groups, group aggregators, and interlaced information structures.
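To make the setting concrete, the following is a minimal simulation sketch, not the construction analyzed in the paper. It assumes Bernoulli rewards, and a simple stand-in for the frequentist behavioral model: each agent picks the action with the highest empirical mean in the data it is shown, trying any action with no shown data first. The names `simulate`, `greedy_choice`, `full_history`, `focus_groups`, and `pseudo_regret` are hypothetical, and the round-robin `focus_groups` rule is only one illustrative way to fix subsequences ex ante.

```python
import random
from typing import Callable, List, Sequence, Tuple

Observation = Tuple[int, float]  # (action index, realized reward)


def greedy_choice(shown: Sequence[Observation], n_actions: int) -> int:
    """Assumed agent behavior: empirical-mean greedy on the disclosed data."""
    totals = [0.0] * n_actions
    counts = [0] * n_actions
    for action, reward in shown:
        totals[action] += reward
        counts[action] += 1
    # Assumption: an action with no disclosed samples is tried first.
    for action in range(n_actions):
        if counts[action] == 0:
            return action
    return max(range(n_actions), key=lambda a: totals[a] / counts[a])


def simulate(means: Sequence[float],
             visible: Callable[[int], List[int]],
             horizon: int,
             seed: int = 0) -> List[Observation]:
    """Run `horizon` agents; agent t sees only past agents listed by visible(t)."""
    rng = random.Random(seed)
    history: List[Observation] = []
    for t in range(horizon):
        shown = [history[s] for s in visible(t)]  # subsequence fixed ex ante
        action = greedy_choice(shown, len(means))
        reward = 1.0 if rng.random() < means[action] else 0.0  # Bernoulli draw
        history.append((action, reward))
    return history


def full_history(t: int) -> List[int]:
    """Full disclosure: agent t sees every earlier agent."""
    return list(range(t))


def focus_groups(n_groups: int) -> Callable[[int], List[int]]:
    """Illustrative rule: agent t sees only earlier agents in group t mod n_groups."""
    def visible(t: int) -> List[int]:
        return list(range(t % n_groups, t, n_groups))
    return visible


def pseudo_regret(means: Sequence[float], history: Sequence[Observation]) -> float:
    """Sum over agents of (best mean reward) minus (mean reward of chosen action)."""
    best = max(means)
    return sum(best - means[action] for action, _ in history)
```

For example, `pseudo_regret(means, simulate(means, focus_groups(4), 1000))` can be compared against the same call with `full_history` to see how the disclosed subsequences change what later agents choose. Because each subsequence depends only on the agent's index and not on realized rewards, it is chosen ex ante in the sense used above.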