CV · LG · Sep 21, 2025

Optimal Transport for Handwritten Text Recognition in a Low-Resource Regime

arXiv:2509.16977v1
h-index: 16
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of limited annotated data for handwritten text recognition, particularly in domains such as historical archives. The contribution is an incremental advance.

The paper tackles handwritten text recognition in low-resource settings by introducing an iterative bootstrapping framework that uses optimal transport to align visual features with semantic word representations. The authors report significant accuracy improvements on low-resource HTR benchmarks.

Handwritten Text Recognition (HTR) is a task of central importance in the field of document image understanding. State-of-the-art methods for HTR require the use of extensive annotated sets for training, making them impractical for low-resource domains like historical archives or limited-size modern collections. This paper introduces a novel framework that, unlike the standard HTR model paradigm, can leverage mild prior knowledge of lexical characteristics; this is ideal for scenarios where labeled data are scarce. We propose an iterative bootstrapping approach that aligns visual features extracted from unlabeled images with semantic word representations using Optimal Transport (OT). Starting with a minimal set of labeled examples, the framework iteratively matches word images to text labels, generates pseudo-labels for high-confidence alignments, and retrains the recognizer on the growing dataset. Numerical experiments demonstrate that our iterative visual-semantic alignment scheme significantly improves recognition accuracy on low-resource HTR benchmarks.
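One round of the alignment step described in the abstract could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not specify the cost function, the marginals, the OT solver, or the confidence rule. The squared Euclidean cost, uniform marginals, entropy-regularised Sinkhorn iterations, and the names `sinkhorn`, `align_and_select`, `reg`, and `threshold` are all assumptions made for this sketch. It also assumes image features and word representations already live in a shared vector space. Retraining the recognizer on the selected pseudo-labels is left out.

```python
import numpy as np

def sinkhorn(cost, reg=0.05, n_iters=500):
    """Entropy-regularised OT between uniform marginals (assumed setup).

    Returns the transport plan, shape (n_images, n_words).
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # mass on each word image
    b = np.full(m, 1.0 / m)   # mass on each lexicon word
    K = np.exp(-cost / reg)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale rows toward marginal a
        v = b / (K.T @ u)     # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

def align_and_select(image_feats, word_vecs, reg=0.05, threshold=0.9):
    """Match images to words and flag high-confidence matches.

    The confidence rule (largest share of a row's mass) is hypothetical.
    """
    diff = image_feats[:, None, :] - word_vecs[None, :, :]
    cost = (diff ** 2).sum(axis=-1)   # squared Euclidean cost
    cost = cost / cost.max()          # scale costs to [0, 1]
    plan = sinkhorn(cost, reg=reg)
    row_share = plan / plan.sum(axis=1, keepdims=True)
    labels = row_share.argmax(axis=1)  # best-matching word per image
    conf = row_share.max(axis=1)       # share of mass sent to that word
    keep = conf >= threshold           # candidates for pseudo-labelling
    return labels, conf, keep

# Toy data: three "words" in 2-D, two noisy "images" per word.
rng = np.random.default_rng(0)
word_vecs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
true_ids = np.array([0, 0, 1, 1, 2, 2])
image_feats = word_vecs[true_ids] + 0.01 * rng.standard_normal((6, 2))

labels, conf, keep = align_and_select(image_feats, word_vecs)
print(labels, keep)
```

In a full bootstrapping loop, the images flagged by `keep` would be added to the labeled set with their matched words, the recognizer would be retrained, and features would be re-extracted before the next round. The uniform word marginal used here stands in for the "mild prior knowledge of lexical characteristics" mentioned in the abstract; a word-frequency prior would be a natural alternative under the same assumption.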

