Predicting Visual Memory Schemas with Variational Autoencoders
This work addresses the need for more detailed visual memory schema (VMS) maps in cognitive science and computer vision, though it appears incremental because it builds on prior CNN-based methods.
The paper tackles the generation of visual memory schema (VMS) maps by framing it as an image-to-image translation task solved with a variational autoencoder, which yields higher-resolution dual-channel images that represent true and false memorability separately.
Visual memory schema (VMS) maps show which regions of an image cause that image to be remembered or falsely remembered. Previous work has succeeded in generating low-resolution VMS maps using convolutional neural networks. We instead approach this problem as an image-to-image translation task, making use of a variational autoencoder. This approach allows us to generate higher-resolution dual-channel images that represent visual memory schemas, so that predicted true memorability and predicted false memorability can be evaluated separately. We also evaluate the relationship between VMS maps, predicted VMS maps, ground truth memorability scores, and predicted memorability scores.
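
To illustrate what evaluating the two channels separately could look like, here is a minimal sketch in plain Python. It is not the paper's evaluation code. The abstract does not say which metric is used, so Pearson correlation is an assumption, as is the channel order (true memorability first, false memorability second). The function names `pearson`, `split_channels`, and `per_channel_correlation` are hypothetical, and the 2x2 maps are toy values, not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return cov / sqrt(var_x * var_y)


def split_channels(vms_map):
    """Flatten an H x W grid of (true, false) pairs into two lists.

    Assumption: channel 0 is true memorability, channel 1 is false memorability.
    """
    true_channel = [pixel[0] for row in vms_map for pixel in row]
    false_channel = [pixel[1] for row in vms_map for pixel in row]
    return true_channel, false_channel


def per_channel_correlation(predicted, ground_truth):
    """Score each channel of a predicted map against the ground-truth map."""
    pred_true, pred_false = split_channels(predicted)
    gt_true, gt_false = split_channels(ground_truth)
    return {
        "true": pearson(pred_true, gt_true),
        "false": pearson(pred_false, gt_false),
    }


# Toy 2x2 dual-channel maps (illustrative values only).
ground_truth = [[(0.0, 1.0), (0.2, 0.8)], [(0.6, 0.4), (1.0, 0.0)]]
predicted = [[(0.1, 0.9), (0.3, 0.7)], [(0.5, 0.5), (0.9, 0.1)]]

scores = per_channel_correlation(predicted, ground_truth)
print(scores)
```

Keeping the channels apart, as in this sketch, means a model that predicts true memorability well but false memorability poorly shows up as two distinct scores instead of one blended number.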