CV
Nov 30, 2016

Sequential Person Recognition in Photo Albums with a Recurrent Network

arXiv:1611.09967v1
29 citations
Originality: Incremental advance
AI Analysis

This paper addresses person recognition in photo albums, a challenge for computer vision applications, but the advance is incremental because it builds on existing relational modeling approaches.

The paper tackles the problem of recognizing people in everyday photos by modeling relational information as a sequence prediction task, achieving state-of-the-art performance on the PIPA dataset.

Recognizing the identities of people in everyday photos is still a very challenging problem for machine vision, due to non-frontal faces and changes in clothing, location, lighting, and similar factors. Recent studies have shown that rich relational information between people in the same photo can help in recognizing their identities. In this work, we propose to model the relational information between people as a sequence prediction task. At the core of our work is a novel recurrent network architecture, in which relational information between instances' labels and appearance is modeled jointly. In addition to relational cues, scene context is incorporated in our sequence prediction model at no additional cost. In this sense, our approach is a unified framework for modeling both contextual cues and visual appearance of person instances. Our model is trained end-to-end with a sequence of annotated instances in a photo as inputs, and a sequence of corresponding labels as targets. We demonstrate that this simple but elegant formulation achieves state-of-the-art performance on the newly released People In Photo Albums (PIPA) dataset.
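The sequence formulation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: it swaps in a plain tanh recurrent cell with untrained random weights, and it assumes (hypothetically) that the scene feature initializes the hidden state and that the previous predicted label is fed back as a one-hot vector. The names `init_params`, `predict_sequence`, and the `W_*` parameters are illustrative, not taken from the paper.

```python
import numpy as np


def init_params(feat_dim, num_ids, hidden_dim, seed=0):
    """Random, untrained weights for a toy recurrent cell (illustrative only)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_x": rng.normal(0, scale, (hidden_dim, feat_dim)),    # appearance -> hidden
        "W_y": rng.normal(0, scale, (hidden_dim, num_ids)),     # previous label -> hidden
        "W_h": rng.normal(0, scale, (hidden_dim, hidden_dim)),  # hidden -> hidden
        "W_s": rng.normal(0, scale, (hidden_dim, feat_dim)),    # scene -> initial hidden
        "W_o": rng.normal(0, scale, (num_ids, hidden_dim)),     # hidden -> identity scores
    }


def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def predict_sequence(params, instance_feats, scene_feat):
    """Predict one identity per person instance, in sequence order.

    Assumption: scene context enters through the initial hidden state,
    so it costs no extra step in the sequence.
    """
    num_ids = params["W_o"].shape[0]
    h = np.tanh(params["W_s"] @ scene_feat)
    prev_label = np.zeros(num_ids)  # no label has been predicted yet
    labels, probs = [], []
    for x in instance_feats:
        # Appearance and the previous label are combined in one recurrent update,
        # so each prediction depends on both what the person looks like and
        # who was already recognized in the same photo.
        h = np.tanh(params["W_x"] @ x + params["W_y"] @ prev_label + params["W_h"] @ h)
        p = softmax(params["W_o"] @ h)
        k = int(p.argmax())
        prev_label = np.zeros(num_ids)
        prev_label[k] = 1.0
        labels.append(k)
        probs.append(p)
    return labels, np.stack(probs)


# Usage: three person instances in one photo, five candidate identities.
params = init_params(feat_dim=8, num_ids=5, hidden_dim=16)
rng = np.random.default_rng(1)
feats = rng.normal(size=(3, 8))   # one appearance feature vector per instance
scene = rng.normal(size=8)        # one scene feature vector for the photo
labels, probs = predict_sequence(params, feats, scene)
```

In a trained version of such a model, the per-step probabilities would be compared against the annotated label sequence (for example with a summed cross-entropy loss) so that the whole pipeline is optimized end-to-end; the sketch above shows only the forward pass.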

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
