CL · LG · Apr 8, 2025

The Zero Body Problem: Probing LLM Use of Sensory Language

arXiv:2504.06393v1 · 4 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses whether AI can mimic embodied human language, a question relevant to researchers in robotics, linguistics, and cognitive science. The advance is incremental: it builds on an existing corpus and focuses on comparative analysis.

The study investigates whether language models can approximate human use of sensory language. All tested models generate stories whose sensory language differs significantly from human usage, but the direction of the difference varies by model family: Gemini models use more sensory language than humans, whereas most models from the other families use less.

Sensory language expresses embodied experiences ranging from taste and sound to excitement and stomachache. This language is of interest to scholars from a wide range of domains including robotics, narratology, linguistics, and cognitive science. In this work, we explore whether language models, which are not embodied, can approximate human use of embodied language. We extend an existing corpus of parallel human and model responses to short story prompts with an additional 18,000 stories generated by 18 popular models. We find that all models generate stories that differ significantly from human usage of sensory language, but the direction of these differences varies considerably between model families. Namely, Gemini models use significantly more sensory language than humans along most axes whereas most models from the remaining five families use significantly less. Linear probes run on five models suggest that they are capable of identifying sensory language. However, we find preliminary evidence suggesting that instruction tuning may discourage usage of sensory language. Finally, to support further work, we release our expanded story dataset.
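To make the human-versus-model comparison above concrete, here is a minimal sketch of one way such a difference could be measured. It is not the authors' pipeline: the tiny `SENSORY_TERMS` lexicon, the `sensory_rate` and `permutation_test` helpers, and the example stories are all hypothetical, and the permutation test is simply one reasonable choice of significance test, assumed here for illustration.

```python
import random
import re
from statistics import mean

# Hypothetical toy lexicon (an assumption for illustration only);
# the lexicons actually used in the paper are not reproduced here.
SENSORY_TERMS = {"taste", "sweet", "loud", "whisper", "cold", "bright", "ache", "smell"}


def sensory_rate(story: str) -> float:
    """Fraction of word tokens in `story` that appear in SENSORY_TERMS."""
    tokens = re.findall(r"[a-z']+", story.lower())
    if not tokens:
        return 0.0
    return sum(t in SENSORY_TERMS for t in tokens) / len(tokens)


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_resamples + 1)


# Made-up example stories, standing in for parallel human and model responses.
human = ["The soup was sweet and the room was cold.", "A loud bang, then a whisper."]
model = ["She went to the store and bought a book.", "He thought about the plan."]

human_rates = [sensory_rate(s) for s in human]
model_rates = [sensory_rate(s) for s in model]
diff, p = permutation_test(human_rates, model_rates, n_resamples=2000)
```

A positive `diff` means the human stories use proportionally more lexicon terms than the model stories, and a negative `diff` would correspond to the pattern reported for Gemini models. With only two stories per group the p-value is not meaningful; a real comparison would need many stories per prompt and per model.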

