CV · CL · Mar 23

BHDD: A Burmese Handwritten Digit Dataset

arXiv:2603.21966 · 7.21 citations · h-index: 1 · Has Code
Predicted impact top 97% in CV · last 90 days · Originality: Synthesis-oriented
AI Analysis

This paper provides a new dataset for Burmese digit recognition, addressing a gap in resources for Myanmar script, but the contribution is incremental because the dataset follows the MNIST format.

The authors introduced the Burmese Handwritten Digit Dataset (BHDD), containing 87,561 grayscale images of handwritten Burmese digits, and achieved up to 99.83% test accuracy using a CNN with batch normalization and augmentation.

We introduce the Burmese Handwritten Digit Dataset (BHDD), a collection of 87,561 grayscale images of handwritten Burmese digits in ten classes. Each image is 28x28 pixels, following the MNIST format. The training set has 60,000 samples split evenly across classes; the test set has 27,561 samples with class frequencies as they arose during collection. Over 150 people of different ages and backgrounds contributed samples. We analyze the dataset's class distribution, pixel statistics, and morphological variation, and identify digit pairs that are easily confused due to the round shapes of the Myanmar script. Simple baselines (an MLP, a two-layer CNN, and an improved CNN with batch normalization and augmentation) reach 99.40%, 99.75%, and 99.83% test accuracy respectively. BHDD is available under CC BY-SA 4.0 at https://github.com/baseresearch/BHDD
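The split described in the abstract can be sanity-checked with a few lines of Python. This is a minimal sketch, not the authors' tooling: the helper name `check_split` is hypothetical, and it assumes labels are available as integers 0 to 9, which the abstract does not state. Only the sizes and the class count come from the abstract.

```python
from collections import Counter

# Figures taken from the abstract.
NUM_CLASSES = 10
TRAIN_SIZE = 60_000
TEST_SIZE = 27_561
TOTAL_SIZE = 87_561

# The two split sizes add up to the stated total.
assert TRAIN_SIZE + TEST_SIZE == TOTAL_SIZE


def check_split(train_labels, test_labels):
    """Hypothetical helper: compare label lists against the split in the abstract.

    Assumes labels are integers in range(NUM_CLASSES).
    """
    train_counts = Counter(train_labels)
    test_counts = Counter(test_labels)
    per_class = TRAIN_SIZE // NUM_CLASSES  # 6,000 if the training set is evenly split
    return {
        "train_size_ok": len(train_labels) == TRAIN_SIZE,
        "test_size_ok": len(test_labels) == TEST_SIZE,
        "train_balanced": (
            set(train_counts) == set(range(NUM_CLASSES))
            and all(n == per_class for n in train_counts.values())
        ),
        # The test set keeps its collected frequencies, so only report them.
        "test_counts": dict(sorted(test_counts.items())),
    }
```

Because the test set is not balanced, overall test accuracy weights frequent digits more heavily; the reported `test_counts` show how uneven that weighting is.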

