Bandit Data-Driven Optimization
This work addresses iterative optimization problems in non-profit and public sector applications. It offers a novel framework, though its improvements over existing methods are incremental.
The paper tackles the challenge of applying machine learning in the non-profit and public sectors, where pain points such as small data and unmodeled objectives are common. It introduces bandit data-driven optimization together with the PROOF algorithm, which is proven to be no-regret and shows superior performance in simulations and in a food rescue case study.
Applications of machine learning in the non-profit and public sectors often feature an iterative workflow of data acquisition, prediction, and optimization of interventions. There are four major pain points that a machine learning pipeline must overcome in order to be actually useful in these settings: small data, data collected only under the default intervention, unmodeled objectives due to a communication gap, and unforeseen consequences of the intervention. In this paper, we introduce bandit data-driven optimization, the first iterative prediction-prescription framework to address these pain points. Bandit data-driven optimization combines the advantages of online bandit learning and offline predictive analytics in an integrated framework. We propose PROOF, a novel algorithm for this framework, and formally prove that it is no-regret. Using numerical simulations, we show that PROOF achieves performance superior to the existing baseline. We also apply PROOF in a detailed case study of food rescue volunteer recommendation, and show that PROOF as a framework works well with the intricacies of ML models in real-world AI for non-profit and public sector applications.
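To make the iterative workflow concrete, the following is a minimal sketch of a generic predict-then-optimize loop with bandit feedback. It is not the PROOF algorithm, whose details are not given in this abstract. Everything in it is an assumption made for illustration: the name `run_loop`, the scalar feature and label, the five-point intervention grid, the squared-mismatch modeled cost, the hidden per-intervention penalty standing in for an unmodeled objective, and the UCB-style exploration bonus.

```python
import math
import random


def run_loop(num_rounds=200, seed=0):
    """Toy predict-then-optimize loop with bandit feedback (illustrative only)."""
    rng = random.Random(seed)
    arms = [0.0, 1.0, 2.0, 3.0, 4.0]            # hypothetical intervention grid
    hidden_penalty = [0.0, 0.5, 1.5, 0.2, 0.0]  # unmodeled cost, unknown to the learner
    true_slope = 2.0                            # assumed truth: label c = 2 * x + noise

    sxx = sxc = 0.0                   # statistics for least squares through the origin
    pulls = [0] * len(arms)           # how often each intervention was chosen
    penalty_sum = [0.0] * len(arms)   # running sum of observed unmodeled cost
    history = []

    for t in range(1, num_rounds + 1):
        # Data acquisition: a new feature arrives.
        x = rng.uniform(0.5, 1.5)

        # Prediction: estimate the label from all data gathered so far.
        slope_hat = sxc / sxx if sxx > 0 else 0.0
        c_hat = slope_hat * x

        # Optimization: minimize modeled cost plus an optimistic estimate
        # of the unmodeled cost (lower confidence bound, since we minimize).
        def score(i):
            if pulls[i] == 0:
                return -math.inf      # try every intervention once
            modeled = (arms[i] - c_hat) ** 2
            mean_penalty = penalty_sum[i] / pulls[i]
            bonus = math.sqrt(2.0 * math.log(t) / pulls[i])
            return modeled + mean_penalty - bonus

        choice = min(range(len(arms)), key=score)

        # Feedback: the true label and the total cost of the chosen intervention.
        c = true_slope * x + rng.gauss(0.0, 0.1)
        modeled_true = (arms[choice] - c) ** 2
        cost = modeled_true + hidden_penalty[choice] + rng.gauss(0.0, 0.1)

        # Update the predictor and the bandit estimate of the unmodeled cost.
        sxx += x * x
        sxc += x * c
        pulls[choice] += 1
        penalty_sum[choice] += cost - modeled_true
        history.append((choice, cost))

    return history, sxc / sxx


if __name__ == "__main__":
    history, slope_hat = run_loop()
    print("estimated slope:", slope_hat)
    print("mean cost, last 50 rounds:", sum(c for _, c in history[-50:]) / 50)
```

The sketch keeps the two ingredients separate on purpose: the regression handles the part of the objective that is modeled, while the per-intervention running mean with an exploration bonus handles the part that is only observed through bandit feedback. A real application would replace both with models suited to the data.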