For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives
This work addresses a gap in the digital humanities, where researchers lack tools to study large-scale facial image archives. Its contribution is incremental, as it combines existing computer vision techniques with semiotic principles.
The authors address the challenge of analyzing the socio-cultural implications of facial images on social media at scale by developing FRESCO, a framework that deconstructs images using computer vision and visual semiotics. They evaluate its consistency and precision on two public datasets.
Social networks are creating a digital world in which the cognitive, emotional, and pragmatic value of the imagery of human faces and bodies is arguably changing. However, researchers in the digital humanities are often ill-equipped to study these phenomena at scale. This work presents FRESCO (Face Representation in E-Societies through Computational Observation), a framework designed to explore the socio-cultural implications of images on social media platforms at scale. FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques, aligning with the principles of visual semiotics. The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer. These levels are analyzed to discern deeper narrative layers within the imagery. Experimental validation confirms the reliability and utility of FRESCO, and we assess its consistency and precision across two public datasets. Subsequently, we introduce the FRESCO score, a metric derived from the framework's output that serves as a reliable measure of similarity in image content.
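To make the three-level decomposition concrete, the following is a minimal sketch of how an image might be represented as numerical and categorical variables and how two such representations might be compared. It is not the FRESCO implementation, and the similarity function is not the FRESCO score, whose definition is not given here. The class `ImageDescriptor`, the feature names (`brightness`, `saturation`, `gaze`, `shot`), the function `toy_similarity`, and the equal weighting of the three levels are all assumptions made for illustration. In practice the variables would be produced by computer vision models rather than entered by hand.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ImageDescriptor:
    """Hypothetical container for one image, split by semiotic level."""
    plastic: dict[str, float]      # numerical variables, assumed scaled to [0, 1]
    figurative: set[str]           # categorical: entities or concepts in the image
    enunciation: dict[str, str]    # categorical: point-of-view variables


def numeric_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """One minus the mean absolute difference over the shared keys."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return 1.0 - sum(abs(a[k] - b[k]) for k in keys) / len(keys)


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two label sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def categorical_agreement(a: dict[str, str], b: dict[str, str]) -> float:
    """Fraction of shared keys on which both descriptors hold the same value."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def toy_similarity(x: ImageDescriptor, y: ImageDescriptor,
                   weights: tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted mean of the per-level similarities (illustrative only)."""
    parts = (
        numeric_similarity(x.plastic, y.plastic),
        jaccard(x.figurative, y.figurative),
        categorical_agreement(x.enunciation, y.enunciation),
    )
    return sum(w * p for w, p in zip(weights, parts))


# Hand-written example values standing in for model outputs.
img_a = ImageDescriptor(
    plastic={"brightness": 0.5, "saturation": 0.5},
    figurative={"face", "glasses"},
    enunciation={"gaze": "frontal", "shot": "close-up"},
)
img_b = ImageDescriptor(
    plastic={"brightness": 0.25, "saturation": 0.75},
    figurative={"face", "hat"},
    enunciation={"gaze": "frontal", "shot": "medium"},
)

print(round(toy_similarity(img_a, img_b), 3))
```

Keeping the three levels as separate fields means each level can be compared with a measure suited to its variable type, and the weights can be changed to emphasize one level over the others.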