Visual Graph Scaffolds for Structural Reasoning in Large Language Models
For LLM researchers, this paper demonstrates that visual graph structures can serve as internal reasoning aids, not just external knowledge sources, and it highlights a modality gap between visual and text-based graph guidance.
The paper investigates whether visual graph scaffolds can improve reasoning in LLMs beyond providing external knowledge. Experiments on multi-hop QA show that visual graph guidance remains effective even without direct answer hints, while text-based graph guidance degrades, supporting the use of graphs as visual reasoning organizers.
Graphs have been used to enhance large language models (LLMs) for structured reasoning, mostly as external knowledge sources provided to models at test time. In this paper, we take a different view: the value of graphs for LLMs lies not only in supplying information, but also in organizing reasoning. Inspired by how humans use graph-structured mind maps to organize branching and converging thoughts, we ask whether graphs can serve as an internal form of reasoning assistance. We study this question on multi-hop question answering tasks, where teacher-provided reasoning traces are rewritten as graph mind maps and used to guide a student model. Our experiments reveal a clear modality gap. When graph structures are flattened into text, their benefits become limited once direct answer hints are removed: under this abstract guidance setting, both reasoning efficiency and answer quality degrade substantially. In contrast, visual graph guidance remains effective without direct answer clues, and its advantage persists after supervised fine-tuning and KL-based distillation. These findings support the claim that graphs should be studied not only as external knowledge structures for LLMs, but also as visual scaffolds for organizing reasoning.
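
To make the two guidance modalities concrete, the following minimal sketch takes one small mind map and produces both a flattened text form and a Graphviz DOT description that could be rendered to an image. It is an illustration under stated assumptions, not the paper's pipeline: the edge-list representation, the example nodes, and the function names `flatten_to_text` and `to_dot` are all hypothetical, and the abstract does not specify how the graphs are serialized or rendered.

```python
from typing import List, Tuple

# An edge is (source node, relation label, target node).
Edge = Tuple[str, str, str]

# Hypothetical two-hop mind map; node and relation names are illustrative only.
MIND_MAP: List[Edge] = [
    ("question", "asks about", "entity A"),
    ("entity A", "linked to", "entity B"),
    ("entity B", "leads to", "conclusion"),
]


def flatten_to_text(edges: List[Edge]) -> str:
    """Serialize the graph as one line per edge (text-based guidance)."""
    return "\n".join(f"{src} -[{rel}]-> {dst}" for src, rel, dst in edges)


def to_dot(edges: List[Edge]) -> str:
    """Emit Graphviz DOT source; rendering it yields an image (visual guidance)."""
    lines = ["digraph mind_map {"]
    for src, rel, dst in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{rel}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(flatten_to_text(MIND_MAP))
    print(to_dot(MIND_MAP))
```

The DOT output can be turned into a picture with the Graphviz command-line tool, for example `dot -Tpng mind_map.dot -o mind_map.png`. Both forms carry the same edges, so any difference in downstream behavior would come from the modality rather than the content.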