CV

Jul 9, 2020

Maximum Entropy Regularization and Chinese Text Recognition

Changxu Cheng, Wuheng Xu, Xiang Bai, Bin Feng, Wenyu Liu

arXiv:2007.04651v1

4.24 citations

Originality: Incremental advance

AI Analysis

This paper addresses overfitting in Chinese text recognition, a problem relevant to applications such as OCR and image classification. The advance is incremental because it builds on existing regularization techniques.

The paper targets overfitting in Chinese text recognition, which it attributes to class imbalance and the large number of fine-grained characters. It proposes Maximum Entropy Regularization, which adds a negative entropy term to the cross-entropy loss, and reports consistent improvements on Chinese character recognition, Chinese text line recognition, and fine-grained image classification.

Chinese text recognition is more challenging than Latin text recognition because of the large number of fine-grained Chinese characters and the great imbalance across classes, which together cause a serious overfitting problem. We propose to apply Maximum Entropy Regularization to regularize the training process, which simply adds a negative entropy term to the canonical cross-entropy loss without any additional parameters or modification of the model. We theoretically give the convergence probability distribution and analyze how the regularization influences the learning process. Experiments on Chinese character recognition, Chinese text line recognition, and fine-grained image classification achieve consistent improvement, proving that the regularization is beneficial to the generalization and robustness of a recognition model.
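The loss described in the abstract can be illustrated with a minimal sketch, assuming the common form: cross-entropy minus a weighted entropy of the predicted distribution. The weight `lam`, its default value, and all function names are hypothetical choices for illustration. The abstract does not give the paper's exact weighting or implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum_exp for z in logits]


def entropy(log_probs):
    """Shannon entropy H(p) = -sum_k p_k * log(p_k), in nats."""
    return -sum(math.exp(lp) * lp for lp in log_probs)


def max_entropy_regularized_loss(logits, target, lam=0.1):
    """Cross-entropy plus a negative entropy term: CE - lam * H(p).

    `lam` is an assumed hyperparameter name and value, not taken from the paper.
    """
    log_probs = log_softmax(logits)
    cross_entropy = -log_probs[target]
    return cross_entropy - lam * entropy(log_probs)


# A uniform prediction over 4 classes has CE = log(4) and H = log(4).
uniform_loss = max_entropy_regularized_loss([0.0, 0.0, 0.0, 0.0], target=0, lam=0.1)

# Setting lam to 0 recovers the plain cross-entropy loss.
plain_loss = max_entropy_regularized_loss([10.0, 0.0, 0.0], target=0, lam=0.0)
regularized_loss = max_entropy_regularized_loss([10.0, 0.0, 0.0], target=0, lam=0.1)
```

Because entropy is non-negative, the added term can only lower the loss, and it lowers it most for high-entropy predictions. Minimizing the sum therefore discourages over-confident output distributions.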
