Towards Standardization of Data Licenses: The Montreal Data License
This work addresses the problem of unclear data licensing for researchers and practitioners in AI/ML and may help foster fairer data markets. Its contribution is incremental, however, since it builds on existing concepts such as open-source software licensing.
The paper tackles the lack of standardization in data licensing for AI/ML by proposing a taxonomy and the Montreal Data License (MDL) to create a common framework, aiming to increase transparency and resolve ambiguities in licensing language.
This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper's goal is to build towards a common framework for data licensing akin to the licensing of open-source software. Increased transparency and the resolution of conceptual ambiguities in existing licensing language are two noted benefits of the approach proposed in the paper. In parallel, such benefits may help foster fairer and more efficient markets for data by bringing about clearer tools and concepts that better define how data can be used in the fields of AI and ML. The paper's approach is summarized in a new family of data license language: the Montreal Data License (MDL). Alongside this new license, the authors and their collaborators have developed a web-based tool to generate license language espousing the taxonomies articulated in this paper.