CL · AI · Jun 1

PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing

arXiv:2606.02010 · 92.5
Predicted impact: top 22% in CL · last 90 days
Originality: Incremental advance
AI Analysis

Provides a novel, memorization-resistant spatial reasoning benchmark for LLMs, revealing edge count as a key difficulty factor overlooked in prior graph benchmarks.

PlanarBench tests LLMs on planar graph drawing from edge lists, finding edge count is the main difficulty predictor (r=-0.85), with 91 models evaluated on 199 graphs.

PlanarBench tests whether LLMs can draw planar graphs as ASCII art given only an edge list. This spatial reasoning task resists memorization because edge order, edge orientation, and node labels are all permutable. We evaluate 91 models on the 199 simplest non-isomorphic connected planar graphs (2 to 7 vertices). Edge count is the dominant difficulty predictor ($r = -0.85$), a finding not reported in prior LLM graph benchmarks, which use only node count as the difficulty axis.
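
The memorization-resistance argument rests on the fact that one graph has many equivalent edge-list presentations. The sketch below illustrates that idea only; it is not the authors' harness, and the names `permute_instance` and `degree_sequence` are hypothetical helpers introduced here. It assumes an undirected graph given as a list of vertex pairs and produces an isomorphic copy with shuffled node labels, edge order, and edge orientation.

```python
import random


def permute_instance(edges, seed=None):
    """Return an isomorphic copy of an undirected edge list.

    Hypothetical helper: relabels nodes, flips edge orientation at random,
    and shuffles edge order. The underlying graph is unchanged.
    """
    rng = random.Random(seed)

    # Build a random bijection from the original labels onto themselves.
    nodes = sorted({v for edge in edges for v in edge})
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(nodes, shuffled))

    permuted = []
    for u, v in edges:
        a, b = relabel[u], relabel[v]
        # For an undirected graph, (a, b) and (b, a) denote the same edge.
        if rng.random() < 0.5:
            a, b = b, a
        permuted.append((a, b))

    rng.shuffle(permuted)
    return permuted


def degree_sequence(edges):
    """Sorted vertex degrees, a simple invariant preserved by isomorphism."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree.values())


# Example: a 4-cycle with one chord (4 vertices, 5 edges), which is planar.
base = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
variant = permute_instance(base, seed=7)
```

Both `base` and `variant` describe the same graph, so a correct drawing of one is a correct drawing of the other up to relabeling. Matching degree sequences are a necessary check rather than a proof of isomorphism; here isomorphism holds by construction, because the relabeling is a bijection.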

