CV · Mar 5, 2023

A Study of Augmentation Methods for Handwritten Stenography Recognition

arXiv:2303.02761v1 · 2.85 citations · h-index: 5

Originality: Synthesis-oriented

AI Analysis

This work addresses data scarcity in handwritten stenography recognition and offers an incremental improvement over existing HTR methods.

The study tackles the shortage of annotated training data for handwritten stenography recognition by evaluating 22 classical augmentation techniques. It identifies methods that improve recognition performance, such as random rotation, shifts, and scaling within contained ranges, as well as methods that degrade it. The results are supported by statistical hypothesis testing.

One of the factors limiting the performance of handwritten text recognition (HTR) for stenography is the small amount of annotated training data. To alleviate the problem of data scarcity, modern HTR methods often employ data augmentation. However, due to specifics of the stenographic script, such settings may not be directly applicable for stenography recognition. In this work, we study 22 classical augmentation techniques, most of which are commonly used for HTR of other scripts, such as Latin handwriting. Through extensive experiments, we identify a group of augmentations, including for example contained ranges of random rotation, shifts and scaling, that are beneficial to the use case of stenography recognition. Furthermore, a number of augmentation approaches, leading to a decrease in recognition performance, are identified. Our results are supported by statistical hypothesis testing. Links to the publicly available dataset and codebase are provided.
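To make the beneficial group of augmentations concrete, the following is a minimal sketch of random rotation, shift, and scaling applied to a line image stored as a 2D list of pixel values. The function names (`sample_affine_params`, `apply_affine`, `augment`) and the parameter ranges are illustrative assumptions; the abstract does not state the ranges or the implementation used in the paper's codebase.

```python
import math
import random


def sample_affine_params(rng, max_rot_deg=5.0, max_shift=2, scale_range=(0.9, 1.1)):
    """Draw one random rotation (degrees), shift (pixels), and scale factor.

    The default ranges are assumptions chosen for illustration only.
    """
    angle = rng.uniform(-max_rot_deg, max_rot_deg)
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    scale = rng.uniform(*scale_range)
    return angle, dx, dy, scale


def apply_affine(image, angle_deg, dx, dy, scale, fill=0):
    """Rotate and scale an image about its centre, then shift it.

    Uses inverse mapping with nearest-neighbour sampling: for every output
    pixel, look up the source pixel it originates from.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Undo the shift, then undo rotation and scaling about the centre.
            u = x - dx - cx
            v = y - dy - cy
            src_x = (cos_t * u + sin_t * v) / scale + cx
            src_y = (-sin_t * u + cos_t * v) / scale + cy
            sx, sy = int(round(src_x)), int(round(src_y))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def augment(image, rng):
    """Apply one randomly sampled rotation, shift, and scaling to the image."""
    return apply_affine(image, *sample_affine_params(rng))
```

Calling `augment(image, random.Random(0))` once per training sample yields a slightly different view of the same line each time. Keeping the ranges narrow reflects the abstract's point that only contained ranges of these transformations were found to help.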
