TH AI GT Jul 19, 2021

Data Sharing Markets

arXiv:2107.08630v2
21 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient data sharing mechanisms in distributed ML, offering incremental improvements to existing market designs.

The paper tackles the problem of designing data markets for distributed machine learning by modeling agents as both buyers and sellers, proposing algorithms for stable bilateral data exchange and mechanisms for socially optimal unilateral exchange with private information. It introduces the mixed-VCG mechanism to achieve budget balance and truthfulness, extending results to incremental inquiries and differential privacy costs.

With the growing use of distributed machine learning techniques, there is a growing need for data markets that allow agents to share data with each other. Nevertheless, data has unique features that separate it from other commodities, including replicability, cost of sharing, and the ability to distort. We study a setup where each agent can be both a buyer and a seller of data. For this setup, we consider two cases: bilateral data exchange (trading data for data) and unilateral data exchange (trading data for money). We model bilateral sharing as a network formation game and show the existence of a strongly stable outcome under the top agents property by allowing limited complementarity. We propose the ordered match algorithm, which finds the stable outcome in O(N^2) time, where N is the number of agents. For unilateral sharing, under the assumption of an additive cost structure, we construct competitive prices that can implement any social-welfare-maximizing outcome. Finally, for this setup when agents have private information, we propose the mixed-VCG mechanism, which uses zero-cost data distortion of data sharing with its isolated impact to achieve budget balance while truthfully implementing socially optimal outcomes to the exact level of budget imbalance of standard VCG mechanisms. Mixed-VCG uses data distortions as data money for this purpose. We further relax the zero-cost data distortion assumption by proposing distorted-mixed-VCG. We also extend our model and results to data sharing via incremental inquiries and differential privacy costs.
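
The budget imbalance of standard VCG mentioned above can be illustrated with a small sketch. This is the textbook VCG mechanism with Clarke pivot payments, not the paper's mixed-VCG; the function name `vcg_clarke`, the two outcomes, and the numeric valuations are hypothetical and chosen only for illustration. Agent A prefers to keep its data private, while agents B and C benefit if A shares.

```python
def vcg_clarke(valuations):
    """Textbook VCG with Clarke pivot payments (illustrative, not mixed-VCG).

    valuations: list of dicts, one per agent, mapping outcome -> reported value.
    Returns the welfare-maximizing outcome and each agent's payment.
    """
    outcomes = list(valuations[0].keys())

    def welfare(outcome, exclude=None):
        # Total reported value of an outcome, optionally leaving one agent out.
        return sum(v[outcome] for i, v in enumerate(valuations) if i != exclude)

    chosen = max(outcomes, key=welfare)

    payments = []
    for i in range(len(valuations)):
        # Best welfare the other agents could reach if agent i were ignored.
        best_without_i = max(welfare(o, exclude=i) for o in outcomes)
        # Welfare the other agents actually get at the chosen outcome.
        others_at_chosen = welfare(chosen, exclude=i)
        # Agent i pays the externality it imposes on the others.
        payments.append(best_without_i - others_at_chosen)
    return chosen, payments


# Hypothetical valuations: A values privacy, B and C value access to A's data.
valuations = [
    {"share": 0, "no_share": 3},  # agent A
    {"share": 2, "no_share": 0},  # agent B
    {"share": 2, "no_share": 0},  # agent C
]
chosen, payments = vcg_clarke(valuations)
surplus = sum(payments)  # nonzero surplus means the budget is not balanced
print(chosen, payments, surplus)
```

In this toy instance the mechanism selects sharing and collects a strictly positive total payment that is not redistributed to any agent. This leftover is the kind of imbalance that, according to the abstract, mixed-VCG offsets by using data distortions as data money.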

