Salient Object Subitizing
This work addresses the challenge of quickly estimating the number of salient objects in an image. The contribution is incremental: it builds on existing CNN methods, adding a new dataset and a way to improve training with synthetic images.
The paper tackles the problem of predicting the existence and number of salient objects in images using holistic cues, achieving prediction accuracy comparable to human performance for zero or one object and better-than-chance performance for multiple objects without localization.
We study the problem of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in an image using holistic cues. This task is inspired by the ability of people to quickly and accurately identify the number of items within the subitizing range (1-4). To this end, we present a salient object subitizing image dataset of about 14K everyday images, annotated using an online crowdsourcing marketplace. We show that, using an end-to-end trained Convolutional Neural Network (CNN) model, we achieve prediction accuracy comparable to human performance in identifying images with zero or one salient object. For images with multiple salient objects, our model also performs significantly better than chance without requiring any localization process. Moreover, we propose a method to improve the training of the CNN subitizing model by leveraging synthetic images. In experiments, we demonstrate the accuracy and generalizability of our CNN subitizing model and its applications in salient object detection and image retrieval.
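To make the task framing concrete, the following is a minimal sketch of subitizing treated as image-level classification. It is not the authors' code. It assumes, for illustration only, that counts are binned into five classes ("0", "1", "2", "3", and a merged "4+" bin) and that "chance" means uniform guessing over those classes; the abstract states neither. All function names here are hypothetical.

```python
from collections import Counter

# Assumed label set (illustrative): zero objects, one to three objects,
# and a merged bin for four or more.
CLASSES = ["0", "1", "2", "3", "4+"]


def count_to_class(n_salient: int) -> str:
    """Map a raw salient-object count to a subitizing class label."""
    if n_salient < 0:
        raise ValueError("count must be non-negative")
    return "4+" if n_salient >= 4 else str(n_salient)


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly labelled images for each class present in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / total[c] for c in CLASSES if total[c] > 0}


def uniform_chance_level() -> float:
    """Accuracy of uniform random guessing (an assumed baseline)."""
    return 1.0 / len(CLASSES)


if __name__ == "__main__":
    # Toy ground-truth counts and toy predictions, not real data.
    truth = [count_to_class(n) for n in [0, 1, 1, 2, 5]]
    preds = ["0", "1", "2", "2", "4+"]
    print(per_class_accuracy(truth, preds))
    print(uniform_chance_level())
```

Reporting accuracy per class rather than one overall number mirrors the distinction the abstract draws between images with zero or one salient object and images with several.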