CV, AI, HC
Mar 20, 2024

Learning User Embeddings from Human Gaze for Personalised Saliency Prediction

arXiv:2403.13653v2, 5 citations, h-index: 12, Proc. ACM Hum. Comput. Interact.
Originality: Incremental advance
AI Analysis

This work addresses personalised saliency prediction in computer vision applications. It is an incremental advance: it builds on prior user-embedding methods, replacing explicit user inputs with gaze data.

The paper tackles personalised saliency prediction by learning user embeddings from eye tracking data, which removes the need for explicit user characteristics. On two public saliency datasets, the learned embeddings showed high discriminative power, were effective at refining universal saliency maps for individual users, and generalised well across users and images.

Reusable embeddings of user behaviour have shown significant performance improvements for the personalised saliency prediction task. However, prior works require explicit user characteristics and preferences as input, which are often difficult to obtain. We present a novel method to extract user embeddings from pairs of natural images and corresponding saliency maps generated from a small amount of user-specific eye tracking data. At the core of our method is a Siamese convolutional neural encoder that learns the user embeddings by contrasting the image and personal saliency map pairs of different users. Evaluations on two public saliency datasets show that the generated embeddings have high discriminative power, are effective at refining universal saliency maps to the individual users, and generalise well across users and images. Finally, based on our model's ability to encode individual user characteristics, our work points towards other applications that can benefit from reusable embeddings of gaze behaviour.
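
The contrastive idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not name the loss or the architecture, so this sketch assumes a classic margin-based contrastive loss and replaces the convolutional encoder with a toy linear layer. The names `encode`, `contrastive_loss`, `weights`, and `margin` are hypothetical. The only property carried over from the abstract is the Siamese setup: one shared encoder is applied to two (image, personal saliency map) pairs, and the loss depends on whether the pairs come from the same user.

```python
import math


def encode(image, saliency, weights):
    """Toy shared encoder (assumption: stands in for the convolutional encoder).

    Flattens the image and its personal saliency map into one vector,
    applies a single linear layer, and L2-normalises the result.
    """
    x = [p for row in image for p in row] + [p for row in saliency for p in row]
    z = [sum(w * v for w, v in zip(w_row, x)) for w_row in weights]
    norm = math.sqrt(sum(v * v for v in z)) or 1.0  # avoid division by zero
    return [v / norm for v in z]


def contrastive_loss(z_a, z_b, same_user, margin=1.0):
    """Margin-based contrastive loss (assumption: not confirmed by the source).

    Pairs from the same user are pulled together; pairs from different
    users are pushed apart until they are at least `margin` away.
    """
    d = math.dist(z_a, z_b)
    if same_user:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Because both inputs pass through the same `weights`, the two branches of the Siamese setup cannot drift apart: any difference between embeddings comes from the image and saliency pairs themselves. In a real model the encoder would be trained by minimising this loss over many sampled pairs.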

