End-to-End Saliency Mapping via Probability Distribution Prediction
This work addresses a specific problem in saliency estimation for computer vision. It offers an incremental improvement by aligning the training objective more closely with the way saliency maps are evaluated.
The paper tackles the mismatch between how saliency maps are evaluated and the loss functions used to train saliency models. It models a saliency map as a generalized Bernoulli distribution and introduces loss functions based on distances between probability distributions. On four public benchmark datasets, these losses are shown to be more effective than standard ones, and the resulting model improves on state-of-the-art saliency methods.
Most saliency estimation methods aim to explicitly model low-level conspicuity cues such as edges or blobs and may additionally incorporate top-down cues using face or text detection. Data-driven methods for training saliency models using eye-fixation data are increasingly popular, particularly with the introduction of large-scale datasets and deep architectures. However, current methods in this latter paradigm use loss functions designed for classification or regression tasks whereas saliency estimation is evaluated on topographical maps. In this work, we introduce a new saliency map model which formulates a map as a generalized Bernoulli distribution. We then train a deep architecture to predict such maps using novel loss functions which pair the softmax activation function with measures designed to compute distances between probability distributions. We show in extensive experiments the effectiveness of such loss functions over standard ones on four public benchmark datasets, and demonstrate improved performance over state-of-the-art saliency methods.
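The pairing of a softmax activation with a distance between probability distributions can be illustrated with a minimal sketch. It assumes NumPy, treats the generalized Bernoulli distribution as a categorical distribution over pixel locations, and uses KL divergence and Bhattacharyya distance as two illustrative measures; the text above does not name the specific distances, and all function names here are hypothetical rather than the authors' implementation.

```python
import numpy as np

def spatial_softmax(logits):
    """Turn a 2-D score map into a probability distribution over pixels."""
    flat = logits.reshape(-1).astype(np.float64)
    flat = flat - flat.max()          # subtract the max for numerical stability
    exp = np.exp(flat)
    return (exp / exp.sum()).reshape(logits.shape)

def fixations_to_distribution(fixation_map, eps=1e-12):
    """Normalise a non-negative fixation map so that it sums to one."""
    fm = fixation_map.astype(np.float64)
    return fm / (fm.sum() + eps)

def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred), summed over all pixel locations (assumed choice)."""
    return float(np.sum(target * (np.log(target + eps) - np.log(pred + eps))))

def bhattacharyya_distance(target, pred, eps=1e-12):
    """Negative log of the Bhattacharyya coefficient (assumed choice)."""
    coefficient = np.sum(np.sqrt(target * pred))
    return float(-np.log(coefficient + eps))

# Toy example: random network scores and a random ground-truth fixation map.
rng = np.random.default_rng(0)
pred = spatial_softmax(rng.normal(size=(4, 6)))
target = fixations_to_distribution(rng.random((4, 6)))
loss_kl = kl_divergence(target, pred)
loss_bh = bhattacharyya_distance(target, pred)
```

Because the softmax is taken over all pixel locations at once, the prediction and the normalised ground truth are both distributions over the image, so either distance can serve directly as a training loss in place of a per-pixel classification or regression loss.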