CV | Dec 20, 2013

Learning Generative Models with Visual Attention

Yichuan Tang, Nitish Srivastava, Ruslan Salakhutdinov

arXiv:1312.6110v3 | 87 citations

Originality: Incremental advance

AI Analysis

The paper addresses the need for object-centric data in generative modeling, particularly when large images contain background clutter. It is an incremental advance in that it builds on existing attentional mechanisms.

The paper tackles the problem of learning generative models from cluttered images by using visual attention to focus on objects of interest, resulting in a model that can robustly attend to face regions in novel test subjects and learn from new datasets without known face locations.

Attention has long been proposed by psychologists as important for effectively dealing with the enormous sensory stimulus available in the neocortex. Inspired by the visual attention models in computational neuroscience and the need for object-centric data for generative models, we describe a generative learning framework using attentional mechanisms. Attentional mechanisms can propagate signals from a region of interest in a scene to an aligned canonical representation, where generative modeling takes place. By ignoring background clutter, generative models can concentrate their resources on the object of interest. Our model is a proper graphical model where the 2D similarity transformation is a part of the top-down process. A ConvNet is employed to provide good initializations during posterior inference, which is based on Hamiltonian Monte Carlo. Upon learning images of faces, our model can robustly attend to face regions of novel test subjects. More importantly, our model can learn generative models of new faces from a novel dataset of large images where the face locations are not known.
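To make the idea of mapping a region of interest to an aligned canonical representation more concrete, here is a minimal sketch of a 2D similarity transformation (scale, rotation, translation) used to sample a fixed-size patch from a larger image. It is an illustration only and not the paper's implementation: the function names `similarity_matrix` and `attend`, the parameterization, the default patch size, and the nearest-neighbor sampling are all assumptions chosen for brevity.

```python
import numpy as np

def similarity_matrix(scale, theta, tx, ty):
    """Return a 2x3 matrix mapping canonical (x, y, 1) to image (x, y).

    Hypothetical parameterization: uniform scale, rotation angle theta
    in radians, and translation (tx, ty) in image pixels.
    """
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty]])

def attend(image, scale, theta, tx, ty, out_size=24):
    """Sample an out_size x out_size canonical patch from a 2D image."""
    # Canonical grid centered on the origin, one point per output pixel.
    coords = np.arange(out_size) - (out_size - 1) / 2.0
    xs, ys = np.meshgrid(coords, coords)
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # 3 x N

    # Map canonical coordinates into image coordinates.
    u, v = similarity_matrix(scale, theta, tx, ty) @ grid

    # Nearest-neighbor lookup (a simplification), clipped to the image border.
    cols = np.clip(np.rint(u).astype(int), 0, image.shape[1] - 1)
    rows = np.clip(np.rint(v).astype(int), 0, image.shape[0] - 1)
    return image[rows, cols].reshape(out_size, out_size)

# Usage: extract a 4x4 window centered at image position (x=50.5, y=40.5).
image = np.arange(100 * 100).reshape(100, 100)
patch = attend(image, scale=1.0, theta=0.0, tx=50.5, ty=40.5, out_size=4)
```

With unit scale and zero rotation the transform reduces to a plain crop. Larger scales or nonzero angles sample a bigger or rotated window into the same canonical grid, which is the sense in which a generative model could see an aligned object regardless of where it sits in the scene.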
