CV · Mar 25, 2025

Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts

arXiv:2503.19546v1 · 1 citation · h-index: 1 · ICDAR
Originality: Incremental advance
AI Analysis

This work addresses the incremental adaptation of OCR models for users correcting handwritten text, improving efficiency and reducing workload in practical applications.

The paper tackles the problem of adapting OCR models while users correct handwritten text recognition output. It shows that fine-tuning with as few as 16 lines yields a 10% relative CER improvement, rising to 40% with 256 lines, and that confidence-based selection of lines cuts annotation costs by half.
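To make the "relative CER improvement" figures concrete, here is a minimal Python sketch of how character error rate and its relative improvement are commonly computed. The function names and the sample strings are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol (normalization, whitespace handling) may differ.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Character error rate: total edit distance over total reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return errors / chars


def relative_improvement(cer_before: float, cer_after: float) -> float:
    """Fraction of the original error removed, e.g. 0.10 means 10% relative."""
    return (cer_before - cer_after) / cer_before


# Made-up numbers for illustration only: a CER dropping from 10% to 9%
# is a 10% relative improvement, and from 10% to 6% is 40% relative.
print(relative_improvement(0.10, 0.09))
print(relative_improvement(0.10, 0.06))
```

Note that a relative improvement is measured against the baseline error, so a 10% relative gain on a 10% CER is one percentage point in absolute terms.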

A common use case for OCR applications involves users uploading documents and progressively correcting automatic recognition to obtain the final transcript. This correction phase presents an opportunity for progressive adaptation of the OCR model, making it crucial to adapt early, while ensuring stability and reliability. We demonstrate that state-of-the-art transformer-based models can effectively support this adaptation, gradually reducing the annotator's workload. Our results show that fine-tuning can reliably start with just 16 lines, yielding a 10% relative improvement in CER, and scale up to 40% with 256 lines. We further investigate the impact of model components, clarifying the roles of the encoder and decoder in the fine-tuning process. To guide adaptation, we propose reliable stopping criteria, considering both direct approaches and global trend analysis. Additionally, we show that OCR models can be leveraged to cut annotation costs by half through confidence-based selection of informative lines, achieving the same performance with fewer annotations.
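The confidence-based selection mentioned at the end of the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the abstract does not say how line confidence is computed, so the geometric mean of per-character probabilities used here, along with the names `line_confidence` and `select_for_annotation`, is hypothetical and not the authors' method.

```python
import math


def line_confidence(char_probs: list[float]) -> float:
    """Assumed confidence score: geometric mean of per-character probabilities."""
    # Clamp to avoid log(0) when the model assigns zero probability.
    logs = [math.log(max(p, 1e-12)) for p in char_probs]
    return math.exp(sum(logs) / len(logs))


def select_for_annotation(lines: list[tuple[str, list[float]]],
                          budget: int) -> list[str]:
    """Return the ids of the `budget` least confident lines.

    `lines` holds (line_id, per-character probabilities) pairs produced by
    the current OCR model; low-confidence lines are assumed to be the most
    informative ones to correct and fine-tune on.
    """
    ranked = sorted(lines, key=lambda item: line_confidence(item[1]))
    return [line_id for line_id, _ in ranked[:budget]]


# Illustrative probabilities only, not real model outputs.
candidates = [
    ("line-a", [0.90, 0.90]),
    ("line-b", [0.50, 0.60]),
    ("line-c", [0.99, 0.98]),
]
print(select_for_annotation(candidates, budget=2))
```

In a correction workflow, the selected lines would be shown to the annotator first, and the model would be fine-tuned once enough of them have been corrected.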

