CV · Jun 4, 2019

Cross-Domain Cascaded Deep Feature Translation

arXiv:1906.01526v1 · 13 citations
Originality: Incremental advance
AI Analysis

The paper addresses shape translation for computer vision applications, but the advance is incremental: it builds on existing deep learning and adversarial training approaches.

The paper tackles shape translation in the unpaired image-to-image setting by translating deep features from a pre-trained VGG network in a cascaded, deep-to-shallow manner. It reports improved performance over state-of-the-art methods on domains with significantly different shapes.

In recent years we have witnessed tremendous progress in unpaired image-to-image translation methods, propelled by the emergence of DNNs and adversarial training strategies. However, most existing methods focus on transfer of style and appearance, rather than on shape translation. The latter task is challenging, due to its intricate non-local nature, which calls for additional supervision. We mitigate this by descending the deep layers of a pre-trained network, where the deep features contain more semantics, and applying the translation from and between these deep features. Specifically, we leverage VGG, which is a classification network, pre-trained with large-scale semantic supervision. Our translation is performed in a cascaded, deep-to-shallow, fashion, along the deep feature hierarchy: we first translate between the deepest layers that encode the higher-level semantic content of the image, proceeding to translate the shallower layers, conditioned on the deeper ones. We show that our method is able to translate between different domains, which exhibit significantly different shapes. We evaluate our method both qualitatively and quantitatively and compare it to state-of-the-art image-to-image translation methods. Our code and trained models will be made available.
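The deep-to-shallow cascade described in the abstract can be illustrated with a minimal control-flow sketch. This is not the authors' implementation: the layer names (`conv5`, `conv4`, `conv3`), the function `cascaded_translate`, and the toy translators are all hypothetical, and plain lists of floats stand in for VGG feature maps. In the actual method the per-layer translators would be learned networks; here they are simple arithmetic functions chosen only to show how each shallower layer is translated conditioned on the already-translated deeper one.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Stand-in for a feature map; a real system would use tensors from a pre-trained VGG.
Feature = List[float]
# A translator maps (source feature, translated deeper feature or None) to a translated feature.
Translator = Callable[[Feature, Optional[Feature]], Feature]


def upsample(feat: Feature) -> Feature:
    """Nearest-neighbour upsampling by 2 (toy stand-in for spatial upsampling)."""
    return [v for v in feat for _ in range(2)]


def cascaded_translate(
    source_feats: Dict[str, Feature],
    layers_deep_to_shallow: Sequence[str],
    translators: Dict[str, Translator],
) -> Dict[str, Feature]:
    """Translate features layer by layer, starting from the deepest layer.

    Each shallower layer receives the translated output of the layer
    directly above it in the cascade as its conditioning input.
    """
    translated: Dict[str, Feature] = {}
    deeper: Optional[Feature] = None  # the deepest layer has nothing to condition on
    for name in layers_deep_to_shallow:
        translated[name] = translators[name](source_feats[name], deeper)
        deeper = translated[name]
    return translated


# Hypothetical toy translators (assumptions for illustration only).
def translate_deepest(src: Feature, deeper: Optional[Feature]) -> Feature:
    return [2.0 * v for v in src]


def translate_conditioned(src: Feature, deeper: Optional[Feature]) -> Feature:
    assert deeper is not None, "shallower layers need the deeper translation"
    return [s + d for s, d in zip(src, upsample(deeper))]


source = {
    "conv5": [1.0, 2.0],   # deepest, lowest resolution
    "conv4": [0.0] * 4,
    "conv3": [1.0] * 8,    # shallowest, highest resolution
}
result = cascaded_translate(
    source,
    ["conv5", "conv4", "conv3"],
    {
        "conv5": translate_deepest,
        "conv4": translate_conditioned,
        "conv3": translate_conditioned,
    },
)
print(result["conv3"])
```

The point of the sketch is the ordering: the deepest, most semantic layer is translated first, and that result constrains every shallower translation that follows.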

