Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

arXiv:2603.06180v1

AI Analysis

This work addresses the problem of learning script similarity for historians and linguists by enabling the discovery of latent cross-script similarities without requiring ground-truth evolutionary relationships, which are often contested.

This paper proposes a two-stage framework to learn similarity metrics for glyphs and writing systems, addressing the challenge of uncertain historical relationships between scripts. It first trains an encoder with contrastive loss on labeled invented alphabets and then extends to historically attested scripts through teacher-student distillation, enabling effective few-shot glyph recognition and meaningful script clustering without ground-truth evolutionary relationships.

Learning similarity metrics for glyphs and writing systems faces a fundamental challenge: while individual graphemes within invented alphabets can be reliably labeled, the historical relationships between different scripts remain uncertain and contested. We propose a two-stage framework that addresses this epistemological constraint. First, we train an encoder with contrastive loss on labeled invented alphabets, establishing a teacher model with robust discriminative features. Second, we extend to historically attested scripts through teacher-student distillation, where the student learns unsupervised representations guided by the teacher's knowledge but free to discover latent cross-script similarities. The asymmetric setup enables the student to learn deformation-invariant embeddings while inheriting discriminative structure from clean examples. Our approach bridges supervised contrastive learning and unsupervised discovery, enabling both hard boundaries between distinct systems and soft similarities reflecting potential historical influences. Experiments on diverse writing systems demonstrate effective few-shot glyph recognition and meaningful script clustering without requiring ground-truth evolutionary relationships.
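The two training objectives described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact losses, so the supervised contrastive form in stage 1, the cosine-distance distillation target in stage 2, the temperature value, and all function names are assumptions. Embeddings are plain lists of floats standing in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Stage 1 (assumed form): pull same-label glyphs together, push others apart.

    For each anchor, average the negative log-probability of its positives
    under a softmax over similarities to all other samples in the batch.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchors without a positive contribute nothing
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        log_denom = math.log(sum(math.exp(x) for x in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def distillation_loss(student_embeddings, teacher_embeddings):
    """Stage 2 (assumed form): asymmetric teacher-student matching.

    The student embeds a deformed view of each glyph and the frozen teacher
    embeds the clean view; the loss is their mean cosine distance.
    """
    pairs = list(zip(student_embeddings, teacher_embeddings))
    return sum(1.0 - cosine(s, t) for s, t in pairs) / len(pairs)


# Toy usage: two glyph classes in a 2-D embedding space (hypothetical data).
glyphs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
consistent = supervised_contrastive_loss(glyphs, ["a", "a", "b", "b"])
inconsistent = supervised_contrastive_loss(glyphs, ["a", "b", "a", "b"])
print(consistent < inconsistent)
```

In this toy batch, labels that agree with the geometry give a lower stage 1 loss than labels that cut across it. Feeding the teacher clean glyphs and the student deformed ones is one way to read the asymmetric setup; the paper may define it differently.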
