CV · AI · GR · Apr 8, 2021

Generative Landmarks

arXiv:2104.04055v1
AI Analysis

This work addresses inconsistent annotations and the lack of personalization in landmark detection for applications such as face and hand tracking. It appears incremental, however, because it builds on existing GAN and image translation techniques.

The paper tackles sparse landmark detection by posing it as an image translation task. It trains on unpaired marked and unmarked videos, using a generative adversarial network and cyclic consistency to predict deformations of landmark templates. The resulting method is temporally consistent, image class agnostic, and does not rely on manually labelled priors.

We propose a general-purpose approach to detecting landmarks with improved temporal consistency and personalization. Most sparse landmark detection methods rely on laborious manual labelling, where inconsistent annotations over a temporal volume lead to sub-optimal landmark learning. Further, high-quality, personalized landmarks are often hard to obtain. We pose landmark detection as an image translation problem. We capture two unpaired sets of videos, one marked (with paint) and one unmarked. We then use a generative adversarial network and cyclic consistency to predict deformations of landmark templates that simulate markers on unmarked images, until these images are indistinguishable from ground-truth marked images. Our novel method does not rely on manually labelled priors, is temporally consistent, and is image class agnostic; examples of face and hand landmark detection are shown.
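The idea of simulating markers by deforming a landmark template can be illustrated with a small sketch. This is not the paper's implementation: it assumes markers are drawn as Gaussian dots on a grayscale image, and the names `render_markers`, `cycle_l1`, and the `offsets` array (standing in for a generator's predicted deformation) are hypothetical. The adversarial training itself is omitted.

```python
import numpy as np

def render_markers(image, template, offsets, sigma=1.5):
    """Paint soft dots at (template + offsets) onto a grayscale image.

    image:    (H, W) float array with values in [0, 1]
    template: (K, 2) array of (row, col) landmark template positions
    offsets:  (K, 2) per-landmark displacements; a stand-in (assumption)
              for the deformation a generator network would predict
    """
    h, w = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    landmarks = template + offsets          # deformed template = detected landmarks
    marked = image.copy()
    for r, c in landmarks:
        # Gaussian dot centred on the landmark (assumed marker appearance)
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        marked = np.maximum(marked, blob)
    return marked, landmarks

def cycle_l1(original, reconstructed):
    """Mean absolute difference, the usual form of a cycle-consistency penalty."""
    return float(np.mean(np.abs(original - reconstructed)))

# Toy usage: a blank 32x32 frame and a two-point template.
image = np.zeros((32, 32))
template = np.array([[8.0, 8.0], [20.0, 20.0]])
offsets = np.array([[1.0, -1.0], [0.0, 2.0]])
marked, landmarks = render_markers(image, template, offsets)
loss = cycle_l1(image, marked)  # how far the marked frame is from the unmarked one
```

In this framing the landmark positions fall out as a by-product: once the simulated markers look like real painted ones to a discriminator, the deformed template coordinates are the detections.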

