Strategyproof Learning: Building Trustworthy User-Generated Datasets
This work addresses a critical safety issue for any learning scheme that relies on user-generated data: ensuring that the datasets themselves are trustworthy.
The paper tackles the problem of incentivizing truthful data reporting in user-generated datasets. It proposes Licchavi, a learning framework with provable strategyproofness guarantees that bound how much any user can gain by misreporting preferences, while also promoting the "one person, one unit-force vote" fairness principle.
We prove in this paper that, perhaps surprisingly, incentivizing data misreporting is not inevitable. By leveraging a careful design of the loss function, we propose Licchavi, a global and personalized learning framework with provable strategyproofness guarantees. Essentially, we prove that no user can gain much by replying to Licchavi's queries with answers that deviate from their true preferences. Interestingly, Licchavi also promotes the desirable "one person, one unit-force vote" fairness principle. Furthermore, our empirical evaluation of its performance showcases Licchavi's real-world applicability. We believe that our results are critical for the safety of any learning scheme that leverages user-generated data.
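To make the notions of strategyproofness and bounded "force" concrete, the following one-dimensional toy may help. It is not Licchavi's loss function, only a minimal sketch under simplifying assumptions: each user reports a single number, and the aggregate is either the mean or the median of the reports. The median minimizes the sum of absolute deviations, so each report contributes a gradient of magnitude at most one, whereas under the mean an extreme report pulls arbitrarily hard. The helper name and all values below are hypothetical.

```python
from statistics import mean, median


def distance_after_report(aggregate, others, report, true_pref):
    """Distance between the aggregate and the user's true preference
    when the user submits `report` alongside the other users' reports."""
    return abs(aggregate(others + [report]) - true_pref)


# Hypothetical reports from four honest users, and one strategic user
# whose true preference is 1.0.
others = [0.0, 0.2, 0.4, 0.6]
true_pref = 1.0
exaggerated = 4.0  # an extreme misreport meant to drag the aggregate upward

# Under the mean, exaggerating moves the aggregate closer to the true preference.
honest_mean = distance_after_report(mean, others, true_pref, true_pref)
lying_mean = distance_after_report(mean, others, exaggerated, true_pref)

# Under the median, the same exaggeration changes nothing.
honest_median = distance_after_report(median, others, true_pref, true_pref)
lying_median = distance_after_report(median, others, exaggerated, true_pref)

# Brute-force check over a grid of possible reports in [-5, 5]:
# the best achievable distance under the median.
best_median = min(
    distance_after_report(median, others, r / 10, true_pref)
    for r in range(-50, 51)
)

print("mean:   honest", honest_mean, "misreport", lying_mean)
print("median: honest", honest_median, "misreport", lying_median)
print("median: best over grid", best_median)
```

In this toy, the strategic user benefits from misreporting under the mean but not under the median, which is the behavior a strategyproof aggregation rule is meant to guarantee.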