LG · Oct 29, 2025

Bridging Vision, Language, and Mathematics: Pictographic Character Reconstruction with Bézier Curves

arXiv:2511.00076v1 · 1 citation · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the challenge of geometric understanding in vision-language models for applications in visual recognition and program synthesis, showing incremental progress by applying existing methods to a new domain with strong generalization results.

The paper tackles the problem of interpreting geometric structure in visual information by training a vision-language model to decompile raster images of pictographic characters into executable programs of Bézier curves, achieving superior performance to zero-shot baselines like GPT-4o and demonstrating zero-shot generalization from modern Chinese characters to ancient Oracle Bone Script.

While Vision-language Models (VLMs) have demonstrated strong semantic capabilities, their ability to interpret the underlying geometric structure of visual information is less explored. Pictographic characters, which combine visual form with symbolic structure, provide an ideal test case for this capability. We formulate this visual recognition challenge in the mathematical domain, where each character is represented by an executable program of geometric primitives. This is framed as a program synthesis task, training a VLM to decompile raster images into programs composed of Bézier curves. Our model, acting as a "visual decompiler", demonstrates performance superior to strong zero-shot baselines, including GPT-4o. The most significant finding is that when trained solely on modern Chinese characters, the model is able to reconstruct ancient Oracle Bone Script in a zero-shot context. This generalization provides strong evidence that the model acquires an abstract and transferable geometric grammar, moving beyond pixel-level pattern recognition to a more structured form of visual understanding.
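To make the idea of a character as "an executable program of Bézier curves" concrete, here is a minimal sketch of what such a program and its renderer could look like. The program format (a list of cubic Bézier strokes in a unit square), the names `cubic_bezier`, `rasterize`, and `PROGRAM`, and the grid size are all assumptions for illustration; the abstract does not specify the paper's actual representation or curve degree.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical program format: each stroke is one cubic Bézier curve,
# given by four control points in the unit square [0, 1] x [0, 1].
Stroke = Tuple[Point, Point, Point, Point]


def cubic_bezier(stroke: Stroke, t: float) -> Point:
    """Evaluate B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3."""
    u = 1.0 - t
    weights = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    x = sum(w * p[0] for w, p in zip(weights, stroke))
    y = sum(w * p[1] for w, p in zip(weights, stroke))
    return (x, y)


def rasterize(program: List[Stroke], size: int = 16, samples: int = 64) -> List[List[int]]:
    """Render a stroke program to a size x size binary grid by sampling each curve."""
    grid = [[0] * size for _ in range(size)]
    for stroke in program:
        for i in range(samples + 1):
            x, y = cubic_bezier(stroke, i / samples)
            # Map unit-square coordinates to pixel indices, clamped to the grid.
            col = min(size - 1, max(0, int(x * size)))
            row = min(size - 1, max(0, int(y * size)))
            grid[row][col] = 1
    return grid


# Illustrative program: a cross shape (similar to the character for "ten")
# written as two strokes whose control points happen to lie on straight lines.
PROGRAM: List[Stroke] = [
    ((0.1, 0.5), (0.4, 0.5), (0.6, 0.5), (0.9, 0.5)),  # horizontal stroke
    ((0.5, 0.1), (0.5, 0.4), (0.5, 0.6), (0.5, 0.9)),  # vertical stroke
]

if __name__ == "__main__":
    for row in rasterize(PROGRAM):
        print("".join("#" if cell else "." for cell in row))
```

Under this framing, the "visual decompiler" task is the inverse of `rasterize`: given the pixel grid, recover a stroke list like `PROGRAM` that reproduces it when executed.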

