CL · AI · Jun 1

PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing

arXiv:2606.02010 · 92.5

Predicted impact: top 22% in CL · last 90 days
Originality: Incremental advance

AI Analysis

Provides a novel, memorization-resistant spatial reasoning benchmark for LLMs, revealing edge count as a key difficulty factor overlooked in prior graph benchmarks.

PlanarBench tests LLMs on planar graph drawing from edge lists, finding edge count is the main difficulty predictor (r=-0.85), with 91 models evaluated on 199 graphs.

PlanarBench tests whether LLMs can draw planar graphs as ASCII art given only an edge list. This spatial reasoning task resists memorization because edge order, edge orientation, and node labels are all permutable. We evaluate 91 models on the 199 simplest non-isomorphic connected planar graphs (2 to 7 vertices). Edge count is the dominant difficulty predictor ($r = -0.85$), a finding not reported in prior LLM graph benchmarks, which use only node count as the difficulty axis.
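To illustrate why permutable inputs resist memorization, here is a minimal Python sketch that produces a different presentation of the same graph by relabeling nodes, flipping edge orientation, and shuffling edge order. The function names `permute_edge_list` and `degree_sequence`, the integer node labels, and the example graph are assumptions made for illustration. They are not taken from the PlanarBench code or prompt format.

```python
import random


def permute_edge_list(edges, rng):
    """Return a reshuffled presentation of the same undirected graph.

    Hypothetical helper: relabels nodes with a random permutation of the
    existing labels, randomly flips each edge, and shuffles edge order.
    """
    nodes = sorted({v for edge in edges for v in edge})
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(nodes, shuffled))

    out = []
    for u, v in edges:
        a, b = relabel[u], relabel[v]
        if rng.random() < 0.5:  # flip orientation; the graph is undirected
            a, b = b, a
        out.append((a, b))
    rng.shuffle(out)  # edge order carries no information
    return out


def degree_sequence(edges):
    """Sorted vertex degrees, an invariant preserved by relabeling."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sorted(deg.values())


# Example graph (assumed for illustration): a 4-cycle plus one diagonal.
square_with_diagonal = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
variant = permute_edge_list(square_with_diagonal, random.Random(0))
```

Every variant describes the same graph up to isomorphism, so vertex count, edge count, and degree sequence are unchanged while the surface text of the prompt differs.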

