The Impact of AI-Generated Text on the Internet

Jonas Dolezal, Sawood Alam, Mark Graham, Maty Bohacek

arXiv:2604.26965

AI Analysis

For researchers and policymakers concerned about the impact of AI on internet content quality, this paper offers a large-scale empirical estimate of how prevalent AI-generated text is on the web and tests common hypotheses about its effects.

The study estimates that by mid-2025, roughly 35% of newly published websites were classified as AI-generated or AI-assisted, up from zero before ChatGPT's launch in late 2022. Increases in AI-generated text correlate with lower semantic diversity and more positive sentiment. The study finds no statistically significant evidence that they reduce factual accuracy or stylistic diversity, contrary to public perception.

The proliferation of AI-generated and AI-assisted text on the internet is feared to degrade semantic and stylistic diversity and factual accuracy, and to drive other negative developments (sometimes subsumed under the Dead Internet Theory). Answering these questions has been hindered by the fact that it has not been understood how much of the internet is actually AI-generated or AI-edited. To this end, we construct a representative sample of websites published on the internet between 2022 and 2025 using the Internet Archive, and apply a state-of-the-art AI text detector to them. We find that by mid-2025, roughly 35% of newly published websites were classified as AI-generated or AI-assisted, up from zero before ChatGPT's launch in late 2022. We also find statistically significant evidence for some of the identified hypotheses; for example, that increases in AI-generated text on the internet correlate negatively with semantic diversity and positively with the prevalence of positive sentiment. We do not, however, find statistically significant evidence supporting the hypothesis that an increased rate of AI-generated text on the internet decreases factual accuracy or stylistic diversity. Notably, this diverges from public perception, which we measure in a user study in which the majority of US adults reported believing all four of the above-mentioned hypotheses. Individuals who do not use AI or use it infrequently tend to believe in these negative impacts more than those who use it frequently; similarly, individuals who hold negative views of AI tend to believe in these hypotheses more than those with favorable views of the technology.
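As a rough illustration of the kind of analysis the abstract describes, the sketch below computes the share of pages flagged by a detector in each publication period and correlates that share with a per-period diversity score. This is not the authors' pipeline. The function names, the record layout, and the choice of Pearson correlation are all assumptions made for illustration, and the toy records and diversity scores are synthetic placeholders, not results from the paper.

```python
from collections import defaultdict
from math import sqrt


def monthly_ai_share(records):
    """Return {period: fraction of pages flagged as AI} from (period, is_ai) pairs."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for period, is_ai in records:
        totals[period] += 1
        flagged[period] += int(is_ai)
    return {p: flagged[p] / totals[p] for p in sorted(totals)}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Synthetic placeholder data (NOT from the paper): (period, detector verdict).
toy_records = (
    [("2022-06", False)] * 4
    + [("2023-06", True)] + [("2023-06", False)] * 3
    + [("2024-06", True)] * 2 + [("2024-06", False)] * 2
)

# Hypothetical per-period semantic diversity scores, also synthetic.
toy_diversity = {"2022-06": 0.9, "2023-06": 0.8, "2024-06": 0.7}

shares = monthly_ai_share(toy_records)
periods = sorted(shares)
r = pearson([shares[p] for p in periods], [toy_diversity[p] for p in periods])
print(shares)
print(round(r, 3))
```

A negative coefficient on real data would be in line with the reported negative association between AI-generated text and semantic diversity. Establishing statistical significance would additionally require a hypothesis test, which this sketch leaves out.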
