CV · May 19, 2015

Character-level Chinese Writer Identification using Path Signature Feature, DropStroke and Deep CNN

arXiv:1505.04922v1 · 12 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of text-independent writer identification for forensic or authentication applications, but it is incremental as it builds on existing deep learning and data augmentation techniques.

The paper tackles writer identification from limited Chinese handwriting samples by introducing a path-signature feature and DropStroke data augmentation in an end-to-end deep CNN system, achieving significant performance improvements on a dataset of 420 writers with only 200 training samples per writer.

Most existing online writer-identification systems require that the text content be supplied in advance and rely on separately designed features and classifiers. Identification is based on lines of text, entire paragraphs, or entire documents; however, these materials are not always available. In this paper, we introduce a path-signature feature to an end-to-end text-independent writer-identification system with a deep convolutional neural network (DCNN). Because deep models require a considerable amount of data to achieve good performance, we propose a data-augmentation method named DropStroke to enrich personal handwriting. Experiments were conducted on online handwritten Chinese characters from the CASIA-OLHWDB1.0 dataset, which consists of 3,866 classes from 420 writers. For each writer, we used only 200 samples for training and the remaining 3,666 for testing. The results reveal that the path-signature feature is useful for writer identification, and that the proposed DropStroke technique enhances generalization and significantly improves performance.
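The two ideas named in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out either procedure, so everything here is an assumption made for illustration: the function names `drop_stroke` and `signature_level2`, the `drop_prob` parameter, the choice to drop each stroke independently, and the rule of keeping at least one stroke are all hypothetical. The signature function computes the standard truncated path signature (levels 0 to 2) of a single piecewise-linear 2-D path via Chen's identity; how the paper turns signatures into CNN input is not shown.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def drop_stroke(strokes: Sequence[Stroke], drop_prob: float,
                rng: random.Random) -> List[Stroke]:
    """Return a new character sample with some strokes removed.

    Assumption: each stroke is dropped independently with probability
    drop_prob, and at least one stroke is always kept.
    """
    kept = [s for s in strokes if rng.random() >= drop_prob]
    if not kept and strokes:
        # Avoid producing an empty character.
        kept = [rng.choice(list(strokes))]
    return kept


def signature_level2(path: Sequence[Point]):
    """Truncated signature (levels 0, 1, 2) of a piecewise-linear 2-D path."""
    s1 = [0.0, 0.0]                      # level 1: total displacement
    s2 = [[0.0, 0.0], [0.0, 0.0]]        # level 2: iterated integrals
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        d = (x1 - x0, y1 - y0)
        # Chen's identity for appending one straight segment with increment d.
        for i in range(2):
            for j in range(2):
                s2[i][j] += s1[i] * d[j] + 0.5 * d[i] * d[j]
        s1[0] += d[0]
        s1[1] += d[1]
    return 1.0, s1, s2


# Usage: augment a toy two-stroke character, then compute one stroke's signature.
character = [[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 1.0)]]
augmented = drop_stroke(character, drop_prob=0.3, rng=random.Random(0))
level0, level1, level2 = signature_level2(character[0])
```

Calling `drop_stroke` repeatedly with different random states would yield several variants of the same character, which is the sense in which such augmentation can enrich a small per-writer training set.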

