CV · Sep 6, 2025

PictOBI-20k: Unveiling Large Multimodal Models in Visual Decipherment for Pictographic Oracle Bone Characters

arXiv:2509.05773v1 · 3 citations · h-index: 3 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of deciphering ancient pictographic characters, a problem of interest to scholars in archaeology and linguistics. It is incremental, however: its contribution is dataset creation and evaluation rather than a new decipherment method.

The paper tackles the decipherment of oracle bone characters (OBCs) by introducing PictOBI-20k, a dataset of 20k images and over 15k multi-choice questions for evaluating large multimodal models (LMMs). It finds that general LMMs have preliminary decipherment skills but rely heavily on language priors rather than visual information.

Deciphering oracle bone characters (OBCs), the oldest attested form of written Chinese, has remained the ultimate, unwavering goal of scholars, offering an irreplaceable key to understanding humanity's early modes of production. Current OBC decipherment methodologies are primarily constrained by the sporadic nature of archaeological excavations and the limited corpus of inscriptions. With the powerful visual perception capability of large multimodal models (LMMs), the potential of using LMMs for visually deciphering OBCs has increased. In this paper, we introduce PictOBI-20k, a dataset designed to evaluate LMMs on the visual decipherment tasks of pictographic OBCs. It includes 20k meticulously collected OBC and real object images, forming over 15k multi-choice questions. We also conduct subjective annotations to investigate the consistency of the reference point between humans and LMMs in visual reasoning. Experiments indicate that general LMMs possess preliminary visual decipherment skills but do not use visual information effectively; most of the time they are limited by language priors. We hope that our dataset can facilitate the evaluation and optimization of visual attention in future OBC-oriented LMMs. The code and dataset will be available at https://github.com/OBI-Future/PictOBI-20k.
