CV · May 30, 2023

Recognizing People by Body Shape Using Deep Networks of Images and Words

arXiv:2305.19160v1 · 10 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of identifying individuals when faces are not visible, which matters for surveillance and security applications. It is an incremental advance: it builds on existing deep learning methods and adds a novel fusion approach.

The paper tackles person identification from body shape across varying distances and viewpoints. It shows that combining linguistic and non-linguistic deep network representations improves identity-ranking accuracy at all but the farthest distance and reduces false accepts in every condition except the UAV images, with results reported for distances up to 600m and for UAV imagery.

Common and important applications of person identification occur at distances and viewpoints in which the face is not visible or is not sufficiently resolved to be useful. We examine body shape as a biometric across distance and viewpoint variation. We propose an approach that combines standard object classification networks with representations based on linguistic (word-based) descriptions of bodies. Algorithms with and without linguistic training were compared on their ability to identify people from body shape in images captured across a large range of distances/views (close-range, 100m, 200m, 270m, 300m, 370m, 400m, 490m, 500m, 600m, and at elevated pitch in images taken by an unmanned aerial vehicle [UAV]). Accuracy, as measured by identity-match ranking and false accept errors in an open-set test, was surprisingly good. For identity-ranking, linguistic models were more accurate for close-range images, whereas non-linguistic models fared better at intermediate distances. Fusion of the linguistic and non-linguistic embeddings improved performance at all but the farthest distance. Although the non-linguistic model yielded fewer false accepts at all distances, fusion of the linguistic and non-linguistic models decreased false accepts for all but the UAV images. We conclude that linguistic and non-linguistic representations of body shape can offer complementary identity information for bodies that can improve identification in applications of interest.
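The abstract names three ingredients (fused embeddings, identity-match ranking, and open-set false accepts) without saying how each is computed. The sketch below is one plausible reading, not the authors' method: it assumes fusion by concatenating L2-normalized embeddings, cosine similarity as the match score, and a fixed acceptance threshold. The function names `fuse`, `match_ranks`, and `false_accept_rate` are hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def fuse(linguistic, non_linguistic):
    """ASSUMED fusion rule: normalize each embedding, then concatenate them."""
    return np.concatenate(
        [l2_normalize(linguistic), l2_normalize(non_linguistic)], axis=1
    )


def match_ranks(probe, probe_ids, gallery, gallery_ids):
    """Rank (1 = best) of the first correct gallery identity for each probe.

    Assumes every probe identity is present in the gallery (closed-set ranking).
    """
    sims = l2_normalize(probe) @ l2_normalize(gallery).T
    order = np.argsort(-sims, axis=1)          # most similar gallery item first
    hits = gallery_ids[order] == probe_ids[:, None]
    return hits.argmax(axis=1) + 1


def false_accept_rate(impostor_probe, gallery, threshold):
    """Share of probes with no enrolled identity whose best score passes the threshold."""
    sims = l2_normalize(impostor_probe) @ l2_normalize(gallery).T
    return float((sims.max(axis=1) >= threshold).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ids = np.arange(5)
    # Synthetic stand-ins for the two embedding types (not real data).
    gallery = fuse(rng.normal(size=(5, 8)), rng.normal(size=(5, 16)))
    probe = gallery + 0.05 * rng.normal(size=gallery.shape)
    print("ranks:", match_ranks(probe, ids, gallery, ids))
    impostors = rng.normal(size=(20, gallery.shape[1]))
    print("false accept rate:", false_accept_rate(impostors, gallery, threshold=0.8))
```

Concatenation after normalization gives both representations equal weight in the cosine score; a weighted sum of per-model scores would be an equally reasonable assumption and could behave differently at long range.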

