CV | AI | Jul 31, 2018

What am I Searching for: Zero-shot Target Identity Inference in Visual Search

arXiv:1807.11926v2
Originality: Incremental advance
AI Analysis

This work addresses intention inference in visual search, with applications in human-computer interaction and cognitive science. The advance is incremental: it builds on existing eye-tracking and neural network methods.

The paper tackles the problem of inferring a person's search target from their eye movements, specifically from error fixations on non-target objects. The resulting model, InferNet, identifies the target without object-specific training and outperforms competitive null models.

Can we infer intentions from a person's actions? As an example problem, here we consider how to decipher what a person is searching for by decoding their eye movement behavior. We conducted two psychophysics experiments where we monitored eye movements while subjects searched for a target object. We defined the fixations falling on non-target objects as "error fixations". Using those error fixations, we developed a model (InferNet) to infer what the target was. InferNet uses a pre-trained convolutional neural network to extract features from the error fixations and computes a similarity map between the error fixations and all locations across the search image. The model consolidates the similarity maps across layers and integrates these maps across all error fixations. InferNet successfully identifies the subject's goal and outperforms competitive null models, even without any object-specific training on the inference task.
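
The pipeline described in the abstract (similarity map per error fixation, consolidation across layers, integration across fixations) can be illustrated with a minimal sketch. This is not the authors' implementation. Several choices here are assumptions made for illustration: random arrays stand in for the pre-trained CNN feature maps, all layers are assumed to be already resized to a common spatial grid, cosine similarity is used as the similarity measure, averaging is used both across layers and across fixations, and already-fixated locations are masked before picking the best guess. The function names are hypothetical.

```python
import numpy as np


def similarity_map(feature_map, fix_rc):
    """Cosine similarity between the feature vector at one fixated
    location and the feature vector at every location of the map.

    feature_map: array of shape (C, H, W), one layer's features.
    fix_rc: (row, col) of the fixation in feature-map coordinates.
    """
    c, h, w = feature_map.shape
    flat = feature_map.reshape(c, h * w)
    query = feature_map[:, fix_rc[0], fix_rc[1]]
    # Small constant avoids division by zero (assumption for robustness).
    norms = np.linalg.norm(flat, axis=0) * np.linalg.norm(query) + 1e-8
    return (query @ flat / norms).reshape(h, w)


def infer_target_map(layer_maps, fixations):
    """Average the per-layer similarity maps for each error fixation,
    then average the result over all error fixations.

    layer_maps: list of (C_i, H, W) arrays sharing the same H and W.
    fixations: list of (row, col) error fixations.
    """
    total = np.zeros(layer_maps[0].shape[1:])
    for fix in fixations:
        per_layer = [similarity_map(fm, fix) for fm in layer_maps]
        total += np.mean(per_layer, axis=0)  # consolidate across layers
    return total / len(fixations)            # integrate across fixations


def guess_target(layer_maps, fixations):
    """Return the (row, col) with the highest integrated similarity,
    excluding fixated locations, which are known to be non-targets."""
    m = infer_target_map(layer_maps, fixations).copy()
    for r, c in fixations:
        m[r, c] = -np.inf
    return tuple(int(i) for i in np.unravel_index(np.argmax(m), m.shape))


# Toy demo: two stand-in "layers" of random features on a 6x6 grid.
rng = np.random.default_rng(0)
layers = [rng.normal(size=(8, 6, 6)), rng.normal(size=(16, 6, 6))]
for fm in layers:
    fm[:, 2, 3] = fm[:, 1, 1]  # second error fixation shares the features
    fm[:, 4, 5] = fm[:, 1, 1]  # planted "target" that resembles both
guess = guess_target(layers, [(1, 1), (2, 3)])
print(guess)  # (4, 5)
```

In the toy demo, the planted location is recovered because it is the only unfixated location whose features match those at the error fixations. With real CNN features, the match would be graded rather than exact, and the choice of layers and similarity measure would matter.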

Code Implementations: 1 repo
