Model-based Pricing for Machine Learning in a Data Marketplace
This paper addresses the cost inefficiency of data acquisition for participants in data marketplaces. It offers a new pricing approach, though an incremental one, in that it applies noise injection to existing methods.
The paper tackles the problem of high data acquisition costs in machine learning by proposing a model-based pricing framework that directly prices ML model instances instead of data, showing through experiments that it can provide high revenue to sellers and affordability to buyers with low runtime cost.
Data analytics using machine learning (ML) has become ubiquitous in science, business intelligence, journalism and many other domains. While a lot of work focuses on reducing the training cost, inference runtime and storage cost of ML models, little work studies how to reduce the cost of data acquisition, which potentially leads to a loss of sellers' revenue and buyers' affordability and efficiency. In this paper, we propose a model-based pricing (MBP) framework, which instead of pricing the data, directly prices ML model instances. We first formally describe the desired properties of the MBP framework, with a focus on avoiding arbitrage. Next, we show a concrete realization of the MBP framework via a noise injection approach, which provably satisfies the desired formal properties. Based on the proposed framework, we then provide algorithmic solutions for how the seller can assign prices to models under different market scenarios (such as to maximize revenue). Finally, we conduct extensive experiments, which validate that the MBP framework can provide high revenue to the seller, high affordability to the buyer, and also operate at low runtime cost.
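To make the noise injection idea concrete, the following is a minimal sketch of how a seller might turn a buyer's budget into a degraded model instance. It is not the paper's implementation: the Gaussian noise, the square-root price curve, and all names (`price_for_variance`, `variance_for_price`, `sell_model`, `scale`) are assumptions made for illustration. The price curve is chosen to be monotone and subadditive in the inverse noise variance, which are plausible requirements for discouraging arbitrage; the abstract does not state the exact conditions.

```python
import math
import random


def price_for_variance(variance, scale=100.0):
    """Hypothetical price curve: price = scale * sqrt(1 / variance).

    Less noise (smaller variance) means a more accurate model and a
    higher price. sqrt is monotone and subadditive in 1 / variance.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    return scale * math.sqrt(1.0 / variance)


def variance_for_price(price, scale=100.0):
    """Invert the assumed price curve: the noise level a budget buys."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (scale / price) ** 2


def sell_model(optimal_weights, budget, rng, scale=100.0):
    """Return a noisy copy of the seller's optimal model weights.

    Assumption: independent zero-mean Gaussian noise is added to each
    weight, with variance determined by the buyer's budget.
    """
    std = math.sqrt(variance_for_price(budget, scale))
    return [w + rng.gauss(0.0, std) for w in optimal_weights]


def mean_squared_error(weights, reference):
    """Average squared distance between two weight vectors."""
    return sum((a - b) ** 2 for a, b in zip(weights, reference)) / len(reference)


if __name__ == "__main__":
    rng = random.Random(0)
    optimal = [0.5, -1.2, 3.0]  # stand-in for a trained model
    for budget in (50.0, 200.0, 1000.0):
        instance = sell_model(optimal, budget, rng)
        print(budget, mean_squared_error(instance, optimal))
```

Under these assumptions, a buyer with a larger budget receives an instance that is, in expectation, closer to the optimal model, and buying two noisy instances and combining them is no cheaper than buying a single instance at the combined inverse variance.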