Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
This work addresses the need for better usability prediction in web design, though its contribution is incremental: it integrates existing deep learning techniques with heuristic features.
The paper tackles the problem of predicting visual search performance on realistic webpages by developing a deep neural network that combines heuristic and unstructured features, which lets it model complex interactions better than traditional analytical methods.
Modeling visual search not only offers an opportunity to predict the usability of an interface before actually testing it on real users, but also advances scientific understanding of human behavior. In this work, we first conduct a set of analyses on a large-scale dataset of visual search tasks on realistic webpages. We then present a deep neural network that learns to predict the scannability of webpage content, i.e., how easy it is for a user to find a specific target. Our model leverages both heuristic-based features, such as target size, and unstructured features, such as raw image pixels. This approach allows us to model complex interactions that might be involved in a realistic visual search task, which cannot be easily achieved by traditional analytical models. We analyze the model's behavior to offer insights into how the salience map learned by the model aligns with human intuition and how the learned semantic representation of each target type relates to its visual search performance.
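To make the idea of combining heuristic and unstructured features concrete, the following is a minimal sketch of a two-branch predictor, not the architecture used in this work. Everything in it is an assumption made for illustration: the function names (`average_pool`, `dense`, `init_layer`, `predict_difficulty`), the layer sizes, the two example heuristics (normalized target width and height), the random untrained weights, and the use of average pooling in place of a learned image encoder. It assumes a grayscale image at least as large as the pooling grid.

```python
import random


def average_pool(image, grid=2):
    """Reduce a grayscale image (list of rows, values in [0, 1]) to grid*grid cell means."""
    height, width = len(image), len(image[0])
    pooled = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * height // grid, (gy + 1) * height // grid)
            cols = range(gx * width // grid, (gx + 1) * width // grid)
            cell = [image[r][c] for r in rows for c in cols]
            pooled.append(sum(cell) / len(cell))
    return pooled


def dense(inputs, weights, biases, relu=True):
    """Fully connected layer; weights holds one row of input weights per output unit."""
    outputs = []
    for w_row, b in zip(weights, biases):
        value = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(max(0.0, value) if relu else value)
    return outputs


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_difficulty(heuristics, image, params):
    """Encode each input type in its own branch, concatenate, and regress one score."""
    pixel_features = dense(average_pool(image), *params["pixel"])
    heuristic_features = dense(heuristics, *params["heuristic"])
    joint = pixel_features + heuristic_features  # list concatenation
    return dense(joint, *params["head"], relu=False)[0]


rng = random.Random(0)
params = {
    "pixel": init_layer(4, 3, rng),      # 2x2 pooled pixels -> 3 hidden units
    "heuristic": init_layer(2, 3, rng),  # 2 heuristics -> 3 hidden units
    "head": init_layer(6, 1, rng),       # joint features -> 1 score
}
image = [[(r + c) / 6 for c in range(4)] for r in range(4)]  # hypothetical 4x4 patch
score = predict_difficulty([0.1, 0.5], image, params)  # hypothetical target width, height
print(score)
```

Because the weights are random, the printed score carries no meaning. The point is only the data flow: the joint layer sees both feature types at once, so a trained model of this shape could in principle capture interactions between target properties and page appearance.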