CV | CL | Jan 9

Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors

arXiv:2601.05508v1 | h-index: 6 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of structural blindness in AI models for analyzing logographic writing systems, offering a generalizable tool for graphematics analysis, though it is incremental as it builds on existing MLLM capabilities.

The paper tackles the problem of enabling multimodal large language models to analyze the stroke-level structure of hieroglyphic scripts without relying on language-specific priors, resulting in a framework that automatically derives interpretable line-segment representations from character images and demonstrates effectiveness in capturing internal structures and semantics.

Hieroglyphs, as logographic writing systems, encode rich semantic and cultural information within their internal structural composition. Yet current advanced Large Language Models (LLMs) and Multimodal LLMs (MLLMs) usually remain structurally blind to this information. LLMs process characters as textual tokens, while MLLMs additionally view them as raw pixel grids. Both fall short of modeling the underlying logic of character strokes. Furthermore, existing structural analysis methods are often script-specific and labor-intensive. In this paper, we propose Hieroglyphic Stroke Analyzer (HieroSA), a novel and generalizable framework that enables MLLMs to automatically derive stroke-level structures from character bitmaps without handcrafted data. It transforms character images from modern logographic and ancient hieroglyphic scripts into explicit, interpretable line-segment representations in a normalized coordinate space, allowing for cross-lingual generalization. Extensive experiments demonstrate that HieroSA effectively captures character-internal structures and semantics, bypassing the need for language-specific priors. Experimental results highlight the potential of our work as a graphematic analysis tool for a deeper understanding of hieroglyphic scripts. Our code is available at https://github.com/THUNLP-MT/HieroSA.
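To make the idea of a "line-segment representation in a normalized coordinate space" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not HieroSA's actual data format or code: the `Segment` class, the `normalize_segments` helper, the choice of the unit square as the normalized space, and the example strokes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stroke approximated as a straight line from (x1, y1) to (x2, y2).

    Hypothetical structure for illustration; not taken from the paper.
    """
    x1: float
    y1: float
    x2: float
    y2: float


def normalize_segments(segments, width, height):
    """Map pixel-space segments into the unit square [0, 1] x [0, 1].

    Assumption: "normalized" means dividing by the bitmap size, so the same
    glyph gets the same coordinates regardless of image resolution.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [
        Segment(s.x1 / width, s.y1 / height, s.x2 / width, s.y2 / height)
        for s in segments
    ]


# Hypothetical example: a cross-shaped glyph drawn on a 64x64 bitmap,
# described by one horizontal and one vertical stroke.
strokes = [
    Segment(8, 32, 56, 32),   # horizontal stroke
    Segment(32, 8, 32, 56),   # vertical stroke
]
normalized = normalize_segments(strokes, width=64, height=64)
```

Because the coordinates no longer depend on image size, segment lists from different scripts or resolutions can be compared directly, which is one plausible reading of why a normalized space helps cross-lingual generalization.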

