LG · AI · CY · HC · May 4, 2023

Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era

arXiv:2305.02555v2 · 7 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge AI developers and data owners face in building a sustainable, mutually beneficial relationship in the AI era. Its contribution is incremental, since it builds on existing business-model concepts.

The paper tackles the problem of AI tools like ChatGPT needing high-quality data but facing copyright limitations, proposing a new revenue-sharing business model with data providers to foster collaboration and a healthy AI ecosystem. It introduces a prompt-based scoring system using classification and similarity models to measure data engagement and encourage participation.

With various AI tools such as ChatGPT becoming increasingly popular, we are entering a true AI era. We can foresee that exceptional AI tools will soon reap considerable profits. A crucial question arises: should AI tools share revenue with their training data providers in addition to traditional stakeholders and shareholders? The answer is Yes. Large AI tools, such as large language models, always require more and better quality data to continuously improve, but current copyright laws limit their access to various types of data. Sharing revenue between AI tools and their data providers could transform the current hostile zero-sum game relationship between AI tools and a majority of copyrighted data owners into a collaborative and mutually beneficial one, which is necessary to facilitate the development of a virtuous cycle among AI tools, their users and data providers that drives forward AI technology and builds a healthy AI ecosystem. However, current revenue-sharing business models do not work for AI tools in the forthcoming AI era, since the most widely used metrics for website-based traffic and action, such as clicks, will be replaced by new metrics such as prompts and cost per prompt for generative AI tools. A completely new revenue-sharing business model, which must be almost independent of AI tools and be easily explained to data providers, needs to establish a prompt-based scoring system to measure data engagement of each data provider. This paper systematically discusses how to build such a scoring system for all data providers for AI tools based on classification and content similarity models, and outlines the requirements for AI tools or third parties to build it. Sharing revenue with data providers using such a scoring system would encourage more data owners to participate in the revenue-sharing program. This will be a utilitarian AI era where all parties benefit.
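
As a rough illustration of what a prompt-based scoring system could look like, the sketch below credits each data provider according to how similar a prompt is to that provider's content, then splits a revenue pool in proportion to the accumulated scores. This is not the paper's method: the bag-of-words cosine similarity is a simple stand-in for the content similarity model the abstract mentions, the classification step is omitted, and the provider names, sample texts, and function names (`engagement_scores`, `revenue_shares`) are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for a real similarity model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def engagement_scores(prompts, provider_docs):
    """Give each provider a share of every prompt, proportional to similarity."""
    vectors = {name: bag_of_words(doc) for name, doc in provider_docs.items()}
    scores = {name: 0.0 for name in provider_docs}
    for prompt in prompts:
        prompt_vec = bag_of_words(prompt)
        sims = {name: cosine(prompt_vec, vec) for name, vec in vectors.items()}
        total = sum(sims.values())
        if total == 0:
            continue  # the prompt matches no provider, so nobody is credited
        for name, sim in sims.items():
            scores[name] += sim / total  # each prompt distributes one unit of credit
    return scores


def revenue_shares(scores, pool):
    """Split a revenue pool in proportion to accumulated engagement scores."""
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: pool * score / total for name, score in scores.items()}


if __name__ == "__main__":
    # Hypothetical providers and prompts, for illustration only.
    docs = {
        "news_site": "stock market earnings report inflation",
        "recipe_blog": "bake bread flour yeast oven",
    }
    prompts = ["how do I bake bread", "what is inflation", "bread oven temperature"]
    print(revenue_shares(engagement_scores(prompts, docs), pool=300.0))
```

Normalizing each prompt to one unit of credit keeps the score easy to explain to data providers, which the abstract names as a requirement: a provider's score is roughly the number of prompts its content was relevant to.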

