LGFeb 14, 2022

Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens

Felix Ott, David Rügamer, Lucas Heublein, Tim Hamann, Jens Barth, Bernd Bischl, Christopher Mutschler

arXiv:2202.07036v38.724 citations

Originality Incremental advance

AI Analysis

This work addresses the problem of limited data for developing online handwriting recognition methods on paper, which is incremental by providing new datasets and benchmarks for the field.

The paper tackles the lack of data for online handwriting recognition on paper by introducing datasets recorded with a sensor-enhanced pen and benchmarking models, showing that a convolutional network with BiLSTMs outperforms Transformer-based architectures and matches or exceeds 28 state-of-the-art techniques.

Purpose. Handwriting is one of the most frequently occurring patterns in everyday life and with it come challenging applications such as handwriting recognition (HWR), writer identification, and signature verification. In contrast to offline HWR that only uses spatial information (i.e., images), online HWR (OnHWR) uses richer spatio-temporal information (i.e., trajectory data or inertial data). While there exist many offline HWR datasets, there is only little data available for the development of OnHWR methods on paper as it requires hardware-integrated pens. Methods. This paper presents data and benchmark models for real-time sequence-to-sequence (seq2seq) learning and single character-based recognition. Our data is recorded by a sensor-enhanced ballpoint pen, yielding sensor data streams from triaxial accelerometers, a gyroscope, a magnetometer and a force sensor at 100 Hz. We propose a variety of datasets including equations and words for both the writer-dependent and writer-independent tasks. Our datasets allow a comparison between classical OnHWR on tablets and on paper with sensor-enhanced pens. We provide an evaluation benchmark for seq2seq and single character-based HWR using recurrent and temporal convolutional networks and Transformers combined with a connectionist temporal classification (CTC) loss and cross-entropy (CE) losses. Results. Our convolutional network combined with BiLSTMs outperforms Transformer-based architectures, is on par with InceptionTime for sequence-based classification tasks, and yields better results compared to 28 state-of-the-art techniques. Time-series augmentation methods improve the sequence-based task, and we show that CE variants can improve the single classification task.

View on arXiv PDF

Similar