Collaborative Machine Learning Markets with Data-Replication-Robust Payments
This work addresses fair revenue distribution and security in multi-party ML collaborations, offering a novel market design that incrementally improves robustness.
The paper tackles the design of collaborative machine learning markets that distribute revenue fairly and resist data replication. It introduces a payment function and customized models that incentivize high-quality data submission, and its experiments show that the assumptions behind the theoretical analysis approximately hold for common ML models.
We study the problem of collaborative machine learning markets where multiple parties can achieve improved performance on their machine learning tasks by combining their training data. We discuss desired properties for these markets in terms of fair revenue distribution and potential threats, including data replication. We then instantiate a collaborative market both for cases where parties share a common machine learning task and for cases where parties' tasks differ. Our marketplace incentivizes parties to submit high-quality training data and true validation data. To this end, we introduce a novel payment division function that is robust to replication, together with customized output models that perform well only on the requested machine learning tasks. In experiments, we validate the assumptions underlying our theoretical analysis and show that they are approximately satisfied for commonly used machine learning models.
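To make the data replication threat concrete, the following minimal sketch shows how a party can raise its revenue share by registering a copy of its own data under a second identity. It is an illustration under stated assumptions, not the payment function introduced in this work: the Shapley-value division, the toy `coverage_utility`, and the party names `A`, `A_copy`, and `B` are all hypothetical choices made for the example.

```python
from fractions import Fraction
from itertools import permutations


def coverage_utility(coalition, data):
    """Toy utility (assumption): number of distinct data points in the coalition."""
    points = set()
    for party in coalition:
        points |= data[party]
    return len(points)


def shapley_values(data):
    """Exact Shapley values, averaging marginal contributions over all orderings."""
    parties = sorted(data)
    orders = list(permutations(parties))
    values = {party: Fraction(0) for party in parties}
    for order in orders:
        seen = []
        for party in order:
            before = coverage_utility(seen, data)
            seen.append(party)
            values[party] += coverage_utility(seen, data) - before
    return {party: total / len(orders) for party, total in values.items()}


# Honest market: two parties with partially overlapping data.
honest = {"A": {1, 2, 3}, "B": {2, 3, 4}}
# Party A replicates its data under a second identity (hypothetical name).
replicated = {"A": {1, 2, 3}, "A_copy": {1, 2, 3}, "B": {2, 3, 4}}

honest_pay = shapley_values(honest)
replicated_pay = shapley_values(replicated)

# Fraction of total revenue that party A controls in each market.
share_honest = honest_pay["A"] / sum(honest_pay.values())
share_replicated = (
    replicated_pay["A"] + replicated_pay["A_copy"]
) / sum(replicated_pay.values())

print(share_honest, share_replicated)  # 1/2 7/12
```

In this toy setting, replication lifts party A's share from 1/2 to 7/12 even though no new information enters the market. A division rule that is robust to replication would leave that share no larger after the copy is added.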