A model for interpreting social interactions in local image regions
This work addresses the challenge of visual recognition of social interactions for AI systems, but it appears incremental because it builds on the existing concepts of minimal images and interpretation.
The paper tackles the problem of recognizing social interactions in images by identifying minimal local regions that reliably convey interactions such as 'hug' or 'fight', and by modeling their interpretation through components and relations. The results are supported by psychophysics data and modeling.
Understanding social interactions (such as 'hug' or 'fight') is a basic and important capacity of the human visual system, but a challenging and still open problem for modeling. In this work we study visual recognition of social interactions, based on small but recognizable local regions. The approach is based on two novel key components: (i) A given social interaction can be recognized reliably from reduced images (called 'minimal images'). (ii) The recognition of a social interaction depends on identifying components and relations within the minimal image (termed 'interpretation'). We show psychophysics data for minimal images and modeling results for their interpretation. We discuss the integration of minimal configurations in recognizing social interactions in a detailed, high-resolution image.