Places205-VGGNet Models for Scene Recognition
This work improves scene recognition accuracy for computer vision applications, but it is incremental as it applies an existing method to a new dataset.
The authors tackle scene recognition by training VGGNet models on the Places205 dataset, achieving state-of-the-art performance on the MIT67, SUN397, and Places205 datasets.
VGGNets have proven effective for object recognition in still images. However, directly adapting VGGNet models trained on the ImageNet dataset to scene recognition does not yield good performance. This report describes our implementation for training VGGNets on the large-scale Places205 dataset. Specifically, we train three VGGNet models, namely VGGNet-11, VGGNet-13, and VGGNet-16, using a multi-GPU extension of the Caffe toolbox with high computational efficiency. We verify the performance of the trained Places205-VGGNet models on three datasets: MIT67, SUN397, and Places205. Our trained models achieve state-of-the-art performance on these datasets and are made publicly available.
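To make the three model depths concrete, the following is a minimal Python sketch of how they differ. It assumes the models reuse the convolutional layouts of configurations A, B, and D from the original VGGNet paper, followed by three fully connected layers whose last layer has 205 outputs (one per Places205 category). The report itself does not spell out the layer layout, so the names `VGG_CONFIGS`, `count_weight_layers`, and `count_parameters`, as well as the 224x224 input size, are illustrative assumptions rather than the released Caffe definitions.

```python
# Conv-layer layouts of configurations A, B, and D from the original VGGNet paper.
# Assumption: the Places205 models reuse these layouts unchanged.
# Integers are output channels of a 3x3 convolution; "M" is a 2x2 max-pool.
VGG_CONFIGS = {
    "VGGNet-11": [64, "M", 128, "M", 256, 256, "M",
                  512, 512, "M", 512, 512, "M"],
    "VGGNet-13": [64, 64, "M", 128, 128, "M", 256, 256, "M",
                  512, 512, "M", 512, 512, "M"],
    "VGGNet-16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                  512, 512, 512, "M", 512, 512, 512, "M"],
}

NUM_CLASSES = 205                        # Places205 scene categories
FC_WIDTHS = [4096, 4096, NUM_CLASSES]    # final layer resized for Places205


def count_weight_layers(config):
    """Number of layers with learnable weights (conv + fully connected)."""
    convs = sum(1 for item in config if item != "M")
    return convs + len(FC_WIDTHS)


def count_parameters(config, in_channels=3, input_size=224):
    """Total weights and biases, assuming a square input of input_size pixels."""
    params = 0
    channels, size = in_channels, input_size
    for item in config:
        if item == "M":
            size //= 2                   # each pooling halves the spatial size
        else:
            params += 3 * 3 * channels * item + item   # 3x3 kernel + bias
            channels = item
    features = channels * size * size    # flattened input to the first FC layer
    for width in FC_WIDTHS:
        params += features * width + width
        features = width
    return params


if __name__ == "__main__":
    for name, config in VGG_CONFIGS.items():
        print(name, count_weight_layers(config), count_parameters(config))
```

Under these assumptions the only change relative to an ImageNet-trained VGGNet is the width of the last fully connected layer (205 instead of 1000). The convolutional stack is identical, which is why the model names follow the weight-layer counts 11, 13, and 16.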