Visual Tree Convolutional Neural Network in Image Classification
This work addresses a specific challenge in image classification for researchers and practitioners by focusing on confused categories, but it is incremental as it builds on existing CNN methods with modest gains.
The paper tackled the problem of improving classification accuracy on confused categories in image datasets by proposing Visual Tree Convolutional Neural Networks (VT-CNN), which embed a Confusion Visual Tree (CVT) to guide training; the models achieved improvements of 1.36%, 0.89%, and 0.64% over baseline CNNs on CIFAR-10 and CIFAR-100.
In image classification, Convolutional Neural Network(CNN) models have achieved high performance with the rapid development in deep learning. However, some categories in the image datasets are more difficult to distinguished than others. Improving the classification accuracy on these confused categories is benefit to the overall performance. In this paper, we build a Confusion Visual Tree(CVT) based on the confused semantic level information to identify the confused categories. With the information provided by the CVT, we can lead the CNN training procedure to pay more attention on these confused categories. Therefore, we propose Visual Tree Convolutional Neural Networks(VT-CNN) based on the original deep CNN embedded with our CVT. We evaluate our VT-CNN model on the benchmark datasets CIFAR-10 and CIFAR-100. In our experiments, we build up 3 different VT-CNN models and they obtain improvement over their based CNN models by 1.36%, 0.89% and 0.64%, respectively.