CL, AI · Feb 17, 2025

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

arXiv:2502.11614v2 · 2 citations · h-index: 47
Originality: Incremental advance
AI Analysis

This work examines how well humans can detect AI-generated text and whether they prefer human-written text, a question relevant to both researchers and practitioners. It provides evidence that human detection is more feasible than previously reported, though the contribution is an incremental refinement of existing knowledge.

The study challenges prior findings by showing that humans can distinguish AI-generated text from human-written text with 87.6% average accuracy across 16 datasets covering 9 languages and 9 domains. It identifies gaps between human and machine text in concreteness, cultural nuances, and diversity, and finds that humans do not always prefer human-written text when the source is unclear.

Prior studies have shown that distinguishing text generated by large language models (LLMs) from human-written text is highly challenging, and often no better than random guessing. To verify the generalizability of this finding across languages and domains, we perform an extensive case study to identify the upper bound of human detection accuracy. Across 16 datasets covering 9 languages and 9 domains, 19 annotators achieved an average detection accuracy of 87.6%, thus challenging previous conclusions. We find that major gaps between human and machine text lie in concreteness, cultural nuances, and diversity. Explicitly explaining these distinctions in the prompt can partially bridge the gaps in over 50% of the cases. However, we also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.

