Semantic Density Effect (SDE): Maximizing Information Per Token Improves LLM Accuracy
For practitioners using LLMs, SDE offers a simple, zero-cost method to improve prompt effectiveness by removing low-information tokens.
The paper introduces the Semantic Density Effect (SDE), showing that prompts with higher semantic information per token improve LLM accuracy and reduce hallucinations. Ultra-dense prompts (SDE > 0.80) outperform diluted ones by +8.4 percentage points on average across five models and seven benchmarks, with no extra tokens or latency.
We introduce the Semantic Density Effect (SDE): the empirical finding that prompts carrying higher semantic information per token consistently produce more accurate, more focused, and less hallucination-prone outputs across all major LLM families. SDE is defined as the ratio of semantically loaded tokens to total prompt tokens, adjusted for redundancy and concreteness. Unlike prior prompt optimization techniques that add tokens (Chain of Thought), duplicate the prompt (Prompt Repetition), or reorder components (Instruction Placement Effect), SDE improves performance by removing or replacing low-information tokens while preserving or sharpening the semantic signal. Evaluated across five frontier models and seven benchmarks, ultra-dense prompts (SDE > 0.80) outperform their diluted counterparts by an average of +8.4 percentage points, with no additional tokens and no latency overhead. Combined with the Instruction Placement Effect (IPE), the gain reaches +11.7 percentage points.
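To make the ratio in the definition concrete, the following is a minimal sketch of a density proxy, not the paper's scoring method. It assumes a small hand-picked list of low-information tokens (`LOW_INFO_TOKENS`, hypothetical), approximates the redundancy adjustment by counting each distinct content token only once, and does not model concreteness at all. The names `tokenize` and `density_proxy` are illustrative and do not come from the source.

```python
import re

# Hypothetical list of low-information tokens; the source does not specify one.
LOW_INFO_TOKENS = {
    "a", "an", "the", "please", "kindly", "just", "really", "very",
    "could", "would", "you", "i", "me", "to", "of", "that", "is", "it",
    "basically", "actually", "for", "in", "and",
}


def tokenize(prompt: str) -> list[str]:
    """Lowercase the prompt and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", prompt.lower())


def density_proxy(prompt: str) -> float:
    """Rough proxy: distinct content tokens divided by total tokens.

    Using a set of content tokens is a crude stand-in for the redundancy
    adjustment (repeated words count once). Concreteness is not modeled.
    """
    tokens = tokenize(prompt)
    if not tokens:
        return 0.0
    loaded = {t for t in tokens if t not in LOW_INFO_TOKENS}
    return len(loaded) / len(tokens)


diluted = "Could you please just give me a summary of the article that is really short"
dense = "Summarize article: 3 bullet points, 20 words each"

print(f"diluted: {density_proxy(diluted):.2f}")
print(f"dense:   {density_proxy(dense):.2f}")
```

Under these assumptions the dense prompt scores higher than the diluted one because filler words such as "please" and "really" inflate the denominator without adding content. The absolute values from this proxy are not comparable to the SDE > 0.80 threshold reported above, since the paper's exact adjustment terms are not reproduced here.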