CV · AI · HC · Aug 17, 2021

Neural Photofit: Gaze-based Mental Image Reconstruction

arXiv:2108.07524v1 · 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses mental image reconstruction for applications in psychology or forensics. The advance is incremental: it builds on existing neural network approaches and adds a novel gaze-based dataset.

The paper tackles the problem of reconstructing facial images from human gaze patterns, proposing a method that combines three neural networks to decode mental images into photofits. The results show that the method significantly outperforms a mean baseline predictor and, according to a human study, produces visually plausible reconstructions close to observers' mental images. The scoring network is trained on gaze data from 19 participants.

We propose a novel method that leverages human fixations to visually decode the image a person has in mind into a photofit (facial composite). Our method combines three neural networks: an encoder, a scoring network, and a decoder. The encoder extracts image features and predicts a neural activation map for each face looked at by a human observer. A neural scoring network compares the human and neural attention and predicts a relevance score for each extracted image feature. Finally, image features are aggregated into a single feature vector as a linear combination of all features weighted by relevance, which a decoder decodes into the final photofit. We train the neural scoring network on a novel dataset containing gaze data of 19 participants looking at collages of synthetic faces. We show that our method significantly outperforms a mean baseline predictor and report on a human study that shows that we can decode photofits that are visually plausible and close to the observer's mental image.
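
The aggregation step described in the abstract (a linear combination of per-face features weighted by relevance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_features`, the `normalize` option, and the toy numbers are assumptions made for illustration, and the abstract does not say whether the relevance weights are normalized. The encoder, scoring network, and decoder are not modeled here; the sketch only shows how already-computed feature vectors and relevance scores might be combined.

```python
from typing import List, Sequence


def aggregate_features(
    features: Sequence[Sequence[float]],
    relevance: Sequence[float],
    normalize: bool = True,
) -> List[float]:
    """Combine per-face feature vectors into one vector, weighted by relevance.

    `features` holds one feature vector per viewed face; `relevance` holds one
    score per face (hypothetical stand-ins for encoder and scoring outputs).
    """
    if len(features) != len(relevance):
        raise ValueError("need exactly one relevance score per feature vector")
    if not features:
        raise ValueError("need at least one feature vector")

    weights = [float(r) for r in relevance]
    if normalize:
        # Assumption: rescale the weights to sum to 1. The abstract only says
        # "weighted by relevance" and does not specify any normalization.
        total = sum(weights)
        if total == 0:
            raise ValueError("relevance scores sum to zero")
        weights = [w / total for w in weights]

    dim = len(features[0])
    aggregated = [0.0] * dim
    for vector, weight in zip(features, weights):
        if len(vector) != dim:
            raise ValueError("all feature vectors must have the same length")
        # Accumulate weight * feature, component by component.
        for i, value in enumerate(vector):
            aggregated[i] += weight * value
    return aggregated


# Toy example: three faces with 2-dimensional features (made-up numbers).
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_relevance = [2.0, 1.0, 1.0]
toy_result = aggregate_features(toy_features, toy_relevance)
```

In the pipeline the abstract describes, the aggregated vector would then be passed to the decoder to produce the photofit. A face with a higher relevance score contributes more to that vector, which is the mechanism by which gaze is meant to steer the reconstruction.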

