Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology
This work addresses the need for culturally and linguistically aware sentiment analysis in social multimedia, providing a resource for improved cross-lingual models.
The authors address cultural and linguistic differences in visual sentiment by building a large-scale multilingual visual sentiment ontology (MVSO) with a new language-dependent method for discovering adjective-noun pairs. The resulting dataset covers more than 15.6K concepts across 12 languages, with more than 7.36M images.
Every culture and language is unique. Our work expressly focuses on the uniqueness of culture and language in relation to human affect, specifically sentiment and emotion semantics, and how they manifest in social multimedia. We develop sets of sentiment- and emotion-polarized visual concepts by adapting adjective-noun pairs, semantic structures originally introduced by Borth et al. (2013), to a multilingual context. We propose a new language-dependent method for automatic discovery of these adjective-noun constructs. We show how this pipeline can be applied to a social multimedia platform to create a large-scale multilingual visual sentiment concept ontology (MVSO). Unlike the flat structure in Borth et al. (2013), our unified ontology is organized hierarchically by multilingual clusters of visually detectable nouns and subclusters of emotionally biased versions of these nouns. In addition, we present an image-based prediction task to show how well language-specific models generalize in a multilingual context. We also release a new, publicly available dataset of more than 15.6K sentiment-biased visual concepts across 12 languages, with language-specific detector banks and more than 7.36M images and their metadata.
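
To make the hierarchical organization more concrete, the following is a minimal Python sketch of a two-level structure: adjective-noun pairs grouped first by a multilingual noun cluster and then by a sub-grouping of their affective bias. It is an illustration only, not the authors' pipeline. The class `ANP`, the function `build_hierarchy`, the cluster labels, the positive/negative split, and all sentiment values are hypothetical placeholders and are not taken from the released dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ANP:
    """One adjective-noun pair in one language (hypothetical record layout)."""
    language: str
    adjective: str
    noun: str
    sentiment: float  # placeholder score in [-1, 1], not an MVSO value


def build_hierarchy(anps, noun_cluster):
    """Group ANPs by multilingual noun cluster, then by sentiment polarity.

    `noun_cluster` maps (language, noun) to a cluster label. Splitting
    subclusters by polarity sign is an assumption made for this sketch.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for anp in anps:
        cluster = noun_cluster[(anp.language, anp.noun)]
        polarity = "positive" if anp.sentiment >= 0 else "negative"
        tree[cluster][polarity].append(anp)
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {c: dict(sub) for c, sub in tree.items()}


# Toy input: hand-written examples, not entries from the released dataset.
anps = [
    ANP("en", "beautiful", "beach", 0.8),
    ANP("es", "hermosa", "playa", 0.7),
    ANP("en", "dirty", "beach", -0.6),
    ANP("fr", "vieux", "chien", -0.1),
]
noun_cluster = {
    ("en", "beach"): "BEACH",
    ("es", "playa"): "BEACH",
    ("fr", "chien"): "DOG",
}
hierarchy = build_hierarchy(anps, noun_cluster)
```

In this toy example, the English and Spanish pairs for "beach" land in the same noun cluster, which is then split by the sign of the placeholder sentiment score. A flat ontology would instead keep every pair as an unrelated entry.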