Christian Peukert

h-index1
2papers

2 Papers

CYNov 3, 2023
The risks of risk-based AI regulation: taking liability seriously

Martin Kretschmer, Tobias Kretschmer, Alexander Peukert et al.

The development and regulation of multi-purpose, large "foundation models" of AI seems to have reached a critical stage, with major investments and new applications announced every other day. Some experts are calling for a moratorium on the training of AI systems more powerful than GPT-4. Legislators globally compete to set the blueprint for a new regulatory regime. This paper analyses the most advanced legal proposal, the European Union's AI Act currently in the stage of final "trilogue" negotiations between the EU institutions. This legislation will likely have extra-territorial implications, sometimes called "the Brussels effect". It also constitutes a radical departure from conventional information and communications technology policy by regulating AI ex-ante through a risk-based approach that seeks to prevent certain harmful outcomes based on product safety principles. We offer a review and critique, specifically discussing the AI Act's problematic obligations regarding data quality and human oversight. Our proposal is to take liability seriously as the key regulatory mechanism. This signals to industry that if a breach of law occurs, firms are required to know in particular what their inputs were and how to retrain the system to remedy the breach. Moreover, we suggest differentiating between endogenous and exogenous sources of potential harm, which can be mitigated by carefully allocating liability between developers and deployers of AI technology.

GNApr 29, 2024
AI and the Dynamic Supply of Training Data

Christian Peukert, Florian Abeillon, Jérémie Haese et al.

Artificial intelligence (AI) systems rely heavily on human-generated data, yet the people behind that data are often overlooked. Human behavior can play a major role in AI training datasets, be it in limiting access to existing works or in deciding which types of new works to create or whether to create any at all. We examine creators' behavioral change when their works become training data for commercial AI. Specifically, we focus on contributors on Unsplash, a popular stock image platform with about 6 million high-quality photos and illustrations. In the summer of 2020, Unsplash launched a research program and released a dataset of 25,000 images for commercial AI use. We study contributors' reactions, comparing contributors whose works were included in this dataset to contributors whose works were not. Our results suggest that treated contributors left the platform at a higher-than-usual rate and substantially slowed down the rate of new uploads. Professional photographers and more heavily affected users had a stronger reaction than amateurs and less affected users. We also show that affected users changed the variety and novelty of contributions to the platform, which can potentially lead to lower-quality AI outputs in the long run. Our findings highlight a critical trade-off: the drive to expand AI capabilities versus the incentives of those producing training data. We conclude with policy proposals, including dynamic compensation schemes and structured data markets, to realign incentives at the data frontier.