How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning
This work addresses the challenge of costly annotation for valuable but small historical archives, although its contribution is incremental, since it builds on existing fine-tuning methods.
The paper tackles the problem of adapting pretrained Handwriting Recognition models to small, single-author historical document collections with peculiar characteristics. It quantitatively analyzes which characteristics of the pretraining data matter most and reports effective transcription with as few as five real fine-tuning lines.
Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on both modern and historical manuscripts in large benchmark datasets. Nonetheless, those models struggle to reach the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting. This issue is especially relevant for valuable but small collections of documents preserved in historical archives, for which obtaining sufficient annotated training data is costly or, in some cases, infeasible. A possible solution to this challenge is to pretrain HTR models on large datasets and then fine-tune them on small single-author collections. In this paper, we consider large, real benchmark datasets and synthetic ones obtained with a styled Handwritten Text Generation model. Through extensive experimental analysis, which also varies the number of fine-tuning lines, we give a quantitative indication of the characteristics of such data that are most relevant for obtaining an HTR model able to effectively transcribe manuscripts in small collections with as few as five real fine-tuning lines.
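To make the idea of varying the number of fine-tuning lines concrete, the following is a minimal sketch of how a small single-author collection could be split into fine-tuning subsets of different sizes with a shared held-out test set. It is an illustration under stated assumptions, not the protocol used in the paper: the function name `few_shot_splits`, the `(image path, transcription)` line layout, the nested-subset design, and every budget other than five are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical layout of one annotated line: (image path, ground-truth transcription).
Line = Tuple[str, str]


def few_shot_splits(
    lines: List[Line],
    budgets: List[int],
    seed: int = 0,
) -> Dict[int, Tuple[List[Line], List[Line]]]:
    """For each budget k, return (k fine-tuning lines, shared held-out test lines)."""
    if not budgets:
        raise ValueError("budgets must not be empty")
    largest = max(budgets)
    if min(budgets) < 1 or largest >= len(lines):
        raise ValueError("each budget must be >= 1 and smaller than the collection")

    # Shuffle a copy with a fixed seed so the split is reproducible.
    shuffled = lines[:]
    random.Random(seed).shuffle(shuffled)

    # Reserve one test set shared by all budgets so their results are comparable.
    pool, test = shuffled[:largest], shuffled[largest:]

    # Nested subsets: the lines used for k=5 are also part of the k=10 subset, and so on.
    return {k: (pool[:k], test) for k in sorted(budgets)}


# Example with a made-up collection of 40 annotated lines.
collection = [(f"line_{i:03d}.png", f"transcription {i}") for i in range(40)]
splits = few_shot_splits(collection, budgets=[5, 10, 20])
fine_tune_lines, test_lines = splits[5]
print(len(fine_tune_lines), len(test_lines))
```

Keeping the test set fixed across budgets means that any difference in recognition quality between, say, five and twenty fine-tuning lines can be attributed to the amount of fine-tuning data rather than to a change in the evaluation lines.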