CVDec 14, 2015

Understanding Human-Centric Images: From Geometry to Fashion

arXiv:1604.08164v1
Originality Incremental advance
AI Analysis

It addresses the problem of human-centric image analysis for applications like fashion advice, but is incremental as it builds on existing methods.

This thesis developed a hierarchy of tools for understanding humans from monocular RGB images, from low-level descriptors to high-level fashion models, and showed significant improvements across tasks like 3D pose estimation and fashion prediction.

Understanding humans from photographs has always been a fundamental goal of computer vision. In this thesis we have developed a hierarchy of tools that cover a wide range of topics with the objective of understanding humans from monocular RGB image: from low level feature point descriptors to high level fashion-aware conditional random fields models. In order to build these high level models it is paramount to have a battery of robust and reliable low and mid level cues. Along these lines, we have proposed two low-level keypoint descriptors: one based on the theory of the heat diffusion on images, and the other that uses a convolutional neural network to learn discriminative image patch representations. We also introduce distinct low-level generative models for representing human pose: in particular we present a discrete model based on a directed acyclic graph and a continuous model that consists of poses clustered on a Riemannian manifold. As mid level cues we propose two 3D human pose estimation algorithms: one that estimates the 3D pose given a noisy 2D estimation, and an approach that simultaneously estimates both the 2D and 3D pose. Finally, we formulate higher level models built upon low and mid level cues for understanding humans from single images. Concretely, we focus on two different tasks in the context of fashion: semantic segmentation of clothing, and predicting the fashionability from images with metadata to ultimately provide fashion advice to the user. For all presented approaches we present extensive results and comparisons against the state-of-the-art and show significant improvements on the entire variety of tasks we tackle.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes