Building Efficient CNN Architecture for Offline Handwritten Chinese Character Recognition
This paper addresses deployment challenges for offline handwritten Chinese character recognition on hardware with limited computation, though its contributions appear to be incremental optimizations.
The paper tackles the high computational cost and large parameter storage of CNN-based handwritten Chinese character recognition by proposing Global Weighted Average Pooling to reduce parameters and a cascaded model with mid-output layers to speed up inference. The approach achieves 97.1% accuracy on the ICDAR-2013 dataset with only 3.3 MB of storage and an average inference time of 6.9 ms per character.
Methods based on deep convolutional networks have brought great breakthroughs in image classification, providing an end-to-end solution to the handwritten Chinese character recognition (HCCR) problem by learning discriminative features automatically. Nevertheless, state-of-the-art CNNs incur a huge computation cost and require the storage of a large number of parameters, especially in fully connected layers, which makes such networks difficult to deploy on alternative hardware devices with limited computational capacity. To solve the storage problem, we propose a novel technique called Global Weighted Average Pooling, which reduces the number of parameters in the fully connected layer without loss of accuracy. In addition, we implement a cascaded model in a single CNN by adding a mid-output layer that completes recognition as early as possible, which reduces average inference time significantly. Experiments on the ICDAR-2013 offline HCCR dataset show that the proposed approach needs only 6.9 ms on average to classify a character image and achieves state-of-the-art accuracy of 97.1% while requiring only 3.3 MB of storage.
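The two ideas named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration only: the pooling is taken to be a per-channel weighted sum over spatial positions with one learnable weight per position, and the cascade is taken to exit at the mid-output layer when its top softmax probability reaches a threshold. The names `global_weighted_average_pool`, `cascaded_predict`, `early_stage`, `late_stage`, and `threshold` are hypothetical and do not come from the paper.

```python
import math


def global_weighted_average_pool(feature_maps, weights):
    """Collapse each H x W feature map to one scalar via a weighted sum.

    feature_maps and weights are C x H x W nested lists. The per-position
    weights are an ASSUMED form of the learnable pooling parameters.
    """
    pooled = []
    for fmap, wmap in zip(feature_maps, weights):
        total = 0.0
        for frow, wrow in zip(fmap, wmap):
            for f, w in zip(frow, wrow):
                total += f * w
        pooled.append(total)
    return pooled


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cascaded_predict(x, early_stage, late_stage, threshold=0.9):
    """Return (class_index, exit_name), stopping early when confident.

    early_stage(x) -> (features, mid_logits) and late_stage(features) ->
    final_logits are hypothetical callables standing in for the two halves
    of a single CNN. The confidence-threshold exit rule is an ASSUMPTION.
    """
    features, mid_logits = early_stage(x)
    probs = softmax(mid_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= threshold:
        # Confident at the mid-output layer: skip the remaining layers.
        return best, "mid"
    final_probs = softmax(late_stage(features))
    best = max(range(len(final_probs)), key=final_probs.__getitem__)
    return best, "final"
```

Under these assumptions, pooling each channel to a single value means the classifier that follows sees C inputs rather than C x H x W, which is where the parameter saving in the fully connected layer would come from. With uniform weights of 1/(H x W) the sketch reduces to ordinary global average pooling. In the cascade, easy characters leave at the mid-output layer and only ambiguous ones pay for the full network, which is how the average inference time can drop.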