Combined convolutional and recurrent neural networks for hierarchical classification of images
This work addresses hierarchical image classification, which could improve accuracy in domains with structured object categories, but it is incremental as it builds on existing deep learning techniques.
The authors tackled image classification by exploiting hierarchical relations among object classes, proposing a model that combines CNNs with RNNs or sequence-to-sequence models and residual learning, which outperformed state-of-the-art CNNs on a proprietary dataset.
Deep learning models based on CNNs are predominantly used in image classification tasks. Such approaches, assuming independence of object categories, normally use a CNN as a feature learner and apply a flat classifier on top of it. In many settings, however, object classes have hierarchical relations, and classifiers that exploit these relations should perform better. We propose hierarchical classification models that combine a CNN, which extracts hierarchical representations of images, with an RNN or sequence-to-sequence model, which captures the hierarchical tree of classes. In addition, we apply residual learning to the RNN part in order to facilitate training of the compound model and improve its generalization. Experimental results on a real-world proprietary dataset of images show that our hierarchical networks perform better than state-of-the-art CNNs.
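
To make the idea of treating a class hierarchy as a sequence concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each image's target is the root-to-leaf path of its class, and that a decoder picks one label per level among the children of the previously chosen node. The taxonomy, the names `TAXONOMY`, `leaf_paths`, `greedy_decode`, and `TOY_SCORES`, and the fixed scores standing in for the outputs of a CNN plus RNN are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy taxonomy (parent -> children); not taken from the paper's dataset.
TAXONOMY: Dict[str, List[str]] = {
    "root": ["animal", "vehicle"],
    "animal": ["dog", "cat"],
    "vehicle": ["car", "truck"],
}


def leaf_paths(
    tree: Dict[str, List[str]], node: str = "root", prefix: Tuple[str, ...] = ()
) -> List[List[str]]:
    """Enumerate root-to-leaf label sequences (the root itself is omitted).

    Under the assumption above, these are the target sequences a
    sequence decoder would be trained to emit, one label per level.
    """
    children = tree.get(node, [])
    if not children:
        return [list(prefix)]
    paths: List[List[str]] = []
    for child in children:
        paths.extend(leaf_paths(tree, child, prefix + (child,)))
    return paths


def greedy_decode(
    tree: Dict[str, List[str]], score: Callable[[List[str], str], float]
) -> List[str]:
    """Choose the best-scoring child at each level until a leaf is reached.

    `score(prefix, candidate)` stands in for the model's score of
    `candidate` given the labels chosen so far. Restricting candidates
    to the current node's children guarantees a valid path in the tree.
    """
    path: List[str] = []
    node = "root"
    while tree.get(node):
        node = max(tree[node], key=lambda child: score(path, child))
        path.append(node)
    return path


# Fixed, made-up scores in place of real model outputs.
TOY_SCORES = {
    "animal": 0.2, "vehicle": 0.8,
    "dog": 0.9, "cat": 0.1,
    "car": 0.3, "truck": 0.7,
}

if __name__ == "__main__":
    print(leaf_paths(TAXONOMY))
    print(greedy_decode(TAXONOMY, lambda prefix, label: TOY_SCORES[label]))
```

With these toy scores the decoder returns the path vehicle, truck even though "dog" has the highest individual score: once "vehicle" is chosen at the first level, only its children remain candidates. A flat classifier over the four leaf classes has no such constraint, which is the independence assumption the abstract contrasts against.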