CV · Nov 16, 2018

Assessing four Neural Networks on Handwritten Digit Recognition Dataset (MNIST)

arXiv:1811.08278v2 · 47 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for image recognition researchers, focusing on MNIST dataset performance.

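The paper compares four neural networks on MNIST handwritten digit recognition (a CNN, ResNet, DenseNet, and a CapsNet-based improvement on the CNN baseline). It reports that the CapsNet variant reaches 99.75% accuracy, which the authors describe as the best result published at the time, while needing only a small amount of training data.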

Although image recognition has been a research topic for many years, many researchers still take a keen interest in it[1]. In some papers[2][3][4], however, there is a tendency to compare models on only one or two datasets, either because of time constraints or because the model is tailored to a specific task. Accordingly, it is hard to understand how well a given model generalizes across the image recognition field[6]. In this paper, we compare four neural networks on the MNIST dataset[5] under different divisions of the data. Three of them are a Convolutional Neural Network (CNN)[7], a Deep Residual Network (ResNet)[2] and a Dense Convolutional Network (DenseNet)[3]; the fourth is our improvement on the CNN baseline, which introduces the Capsule Network (CapsNet)[1] to the image recognition area. We show that although the previous models do quite a good job in this area, our retrofitting can be applied to get better performance. CapsNet obtains an accuracy of 99.75%, which is the best result published so far. Another encouraging result is that CapsNet needs only a small amount of data to achieve excellent performance. Finally, in future work we will apply CapsNet's ability to generalize to other image recognition fields.
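
The abstract does not say which divisions of MNIST were used, so the following is only a minimal sketch of what evaluating "with different division" could look like. The helper names `divide_dataset` and `accuracy`, the train fractions, and the seed are all hypothetical and are not taken from the paper; the total of 70,000 images is the standard MNIST size.

```python
import random


def divide_dataset(n_samples, train_fraction, seed=0):
    """Return (train_idx, test_idx) for one random division of the data.

    Hypothetical helper: the paper does not describe its splitting procedure.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    # A seeded shuffle keeps each division reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    # Assumed fractions, chosen only to illustrate small and large training sets.
    for fraction in (0.1, 0.5, 0.9):
        train_idx, test_idx = divide_dataset(70_000, fraction)
        print(fraction, len(train_idx), len(test_idx))
```

Each model would then be trained on the `train_idx` images and scored with `accuracy` on the `test_idx` images, so that results at small training fractions show how much data a model needs to perform well.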
