Visual saliency estimation by integrating features using multiple kernel learning
This work addresses a key challenge in visual saliency estimation for computer vision applications. Its contribution is incremental: it builds on existing supervised learning approaches and adds a new feature-integration method.
The paper tackles the problem of determining how much different visual features contribute to overall saliency in images. It proposes a multiple kernel learning (MKL) framework that integrates features at an intermediate level and includes object-specific features from Object-Bank. The authors report state-of-the-art performance compared to SVM- or AdaBoost-based models.
In the last few decades, significant progress has been made in predicting where humans look in images through different computational models. However, how to determine the contributions of different visual features to overall saliency remains an open problem. To address this issue, a recent class of models formulates saliency estimation as a supervised learning problem and accordingly applies machine learning techniques. In this paper, we also address this challenging problem and propose using multiple kernel learning (MKL) to combine information coming from different feature dimensions and to perform integration at an intermediate level. In addition, we suggest using the responses of a recently proposed filter bank of object detectors, known as Object-Bank, as additional semantic high-level features. We show that our MKL-based framework, together with the proposed object-specific features, provides state-of-the-art performance compared to SVM- or AdaBoost-based saliency models.
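To make the idea of combining feature dimensions through kernels concrete, the following is a minimal sketch, not the method used in the paper. It assumes one RBF kernel per feature group and forms a combined kernel as a non-negative weighted sum of the per-group kernels. The weights come from a simple kernel-target alignment heuristic, which stands in for a full MKL optimizer. The feature groups, the toy values, the `gamma` setting, and all function names are hypothetical and chosen only for illustration.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix of an RBF kernel over a list of feature vectors."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return num / (norm_k * n)  # ||yy^T||_F equals n for labels in {-1, +1}

def alignment_weights(kernels, y):
    """Heuristic kernel weights: clipped alignments, normalized to sum to 1."""
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0.0:  # no kernel aligns with the labels: fall back to uniform
        return [1.0 / len(kernels)] * len(kernels)
    return [s / total for s in scores]

def combine(kernels, weights):
    """Combined kernel: weighted sum of the per-feature Gram matrices."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: six image patches, +1 = salient, -1 = not salient.
y = [1, 1, 1, -1, -1, -1]
# Feature group A separates the two classes; group B is unrelated to the labels.
group_a = [[0.9], [1.0], [1.1], [-0.9], [-1.0], [-1.1]]
group_b = [[0.3], [-1.2], [0.8], [0.5], [-0.7], [1.0]]

kernels = [rbf_kernel(group_a, gamma=1.0), rbf_kernel(group_b, gamma=1.0)]
weights = alignment_weights(kernels, y)
K_combined = combine(kernels, weights)

print("kernel weights:", weights)
```

In this toy setting, the informative feature group receives the larger weight. The combined Gram matrix could then be passed to any kernel classifier that accepts a precomputed kernel. A full MKL formulation would instead learn the weights jointly with the classifier.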