CVApr 22, 2019
Leveraging Orientation for Weakly Supervised Object Detection with Application to Firearm LocalizationJaved Iqbal, Muhammad Akhtar Munir, Arif Mahmood et al.
Automatic detection of firearms is important for enhancing the security and safety of people, however, it is a challenging task owing to the wide variations in shape, size, and appearance of firearms. Also, most of the generic object detectors process axis-aligned rectangular areas though, a thin and long rifle may actually cover only a small percentage of that area and the rest may contain irrelevant details suppressing the required object signatures. To handle these challenges, we propose a weakly supervised Orientation Aware Object Detection (OAOD) algorithm which learns to detect oriented object bounding boxes (OBB) while using AxisAligned Bounding Boxes (AABB) for training. The proposed OAOD is different from the existing oriented object detectors which strictly require OBB during training which may not always be present. The goal of training on AABB and detection of OBB is achieved by employing a multistage scheme, with Stage-1 predicting the AABB and Stage-2 predicting OBB. In-between the two stages, the oriented proposal generation module along with the object aligned RoI pooling is designed to extract features based on the predicted orientation and to make these features orientation invariant. A diverse and challenging dataset consisting of eleven thousand images is also proposed for firearm detection which is manually annotated for firearm classification and localization. The proposed ITU Firearm dataset (ITUF) contains a wide range of guns and rifles. The OAOD algorithm is evaluated on the ITUF dataset and compared with current state-of-the-art object detectors, including fully supervised oriented object detectors. OAOD has outperformed both types of object detectors with a significant margin. The experimental results (mAP: 88.3 on AABB & mAP: 77.5 on OBB) demonstrate the effectiveness of the proposed algorithm for firearm detection.
CVJul 25, 2017
Automatic Image Transformation for Inducing AffectAfsheen Rafaqat Ali, Mohsen Ali
Current image transformation and recoloring algorithms try to introduce artistic effects in the photographed images, based on user input of target image(s) or selection of pre-designed filters. These manipulations, although intended to enhance the impact of an image on the viewer, do not include the option of image transformation by specifying the affect information. In this paper we present an automatic image-transformation method that transforms the source image such that it can induce an emotional affect on the viewer, as desired by the user. Our proposed novel image emotion transfer algorithm does not require a user-specified target image. The proposed algorithm uses features extracted from top layers of deep convolutional neural network and the user-specified emotion distribution to select multiple target images from an image database for color transformation, such that the resultant image has desired emotional impact. Our method can handle more diverse set of photographs than the previous methods. We conducted a detailed user study showing the effectiveness of our proposed method. A discussion and reasoning of failure cases has also been provided, indicating inherent limitation of color-transfer based methods in the use of emotion assignment. Project Page: http://im.itu.edu.pk/affective-image-transfer/
CVMay 8, 2017
High-Level Concepts for Affective Understanding of ImagesAfsheen Rafaqat Ali, Usman Shahid, Mohsen Ali et al.
This paper aims to bridge the affective gap between image content and the emotional response of the viewer it elicits by using High-Level Concepts (HLCs). In contrast to previous work that relied solely on low-level features or used convolutional neural network (CNN) as a black-box, we use HLCs generated by pretrained CNNs in an explicit way to investigate the relations/associations between these HLCs and a (small) set of Ekman's emotional classes. As a proof-of-concept, we first propose a linear admixture model for modeling these relations, and the resulting computational framework allows us to determine the associations between each emotion class and certain HLCs (objects and places). This linear model is further extended to a nonlinear model using support vector regression (SVR) that aims to predict the viewer's emotional response using both low-level image features and HLCs extracted from images. These class-specific regressors are then assembled into a regressor ensemble that provide a flexible and effective predictor for predicting viewer's emotional responses from images. Experimental results have demonstrated that our results are comparable to existing methods, with a clear view of the association between HLCs and emotional classes that is ostensibly missing in most existing work.