Learning Multi-level Deep Representations for Image Emotion Classification
This work addresses the problem of accurately classifying emotions in images for applications such as content analysis, though its contribution is incremental: it integrates existing feature types.
The paper tackles image emotion classification by proposing MldrNet, a deep network that combines multi-level representations (image semantics, image aesthetics, and low-level visual features). It reports an improvement of at least 6% in overall classification accuracy over state-of-the-art methods on datasets of Internet images and abstract paintings.
In this paper, we propose a new deep network that learns multi-level deep representations for image emotion classification (MldrNet). Image emotion can be recognized through image semantics, image aesthetics, and low-level visual features from both global and local views. Existing image emotion classification methods, whether based on hand-crafted features or deep features, mainly focus on either low-level visual features or semantic-level image representations without taking all factors into consideration. The proposed MldrNet combines deep representations of different levels, i.e., image semantics, image aesthetics, and low-level visual features, to effectively classify the emotion types of different kinds of images, such as abstract paintings and web images. Extensive experiments on both Internet images and abstract paintings demonstrate that the proposed method outperforms state-of-the-art methods using deep features or hand-crafted features, with an improvement of at least 6% in overall classification accuracy.
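The idea of combining representations from several levels can be illustrated with a minimal sketch. The abstract does not specify how MldrNet fuses its levels, so the code below assumes plain concatenation followed by a linear layer and softmax. The function names, feature dimensions, random weights, and the number of emotion classes are all hypothetical and are not taken from the paper.

```python
import math
import random

# The three representation levels named in the abstract.
LEVELS = ("semantics", "aesthetics", "low_level")


def fuse(features):
    """Concatenate per-level feature vectors in a fixed order.

    Concatenation is an assumed fusion rule, not necessarily the one MldrNet uses.
    """
    fused = []
    for level in LEVELS:
        fused.extend(features[level])
    return fused


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Apply a linear layer and softmax to the fused vector; return class probabilities."""
    x = fuse(features)
    scores = [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Toy per-level features standing in for the outputs of separate network branches.
features = {
    "semantics": [0.2, 0.7, 0.1],
    "aesthetics": [0.5, 0.4],
    "low_level": [0.9, 0.3, 0.6, 0.1],
}

num_classes = 8  # hypothetical number of emotion categories
dim = sum(len(v) for v in features.values())

# Untrained random weights, used only to make the sketch executable.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_classes)]
biases = [0.0] * num_classes

probs = classify(features, weights, biases)
predicted = max(range(num_classes), key=probs.__getitem__)
print(f"predicted class index: {predicted}")
```

In a trained model, each entry of `features` would come from a learned branch of the network and the classifier weights would be fitted to labeled emotion data; here they are placeholders that only show how the fused vector flows into a single prediction.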