CV LG | Mar 6, 2018

The Contextual Loss for Image Transformation with Non-Aligned Data

Roey Mechrez, Itamar Talmi, Lihi Zelnik-Manor

arXiv:1803.02077v4 | 33.7 | 416 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key bottleneck in image transformation for computer vision researchers and practitioners: it enables training on non-aligned data. The advance is incremental, since it builds on existing loss-function paradigms.

The paper tackles image transformation when aligned training pairs are unavailable. It introduces a loss function that compares semantically similar regions without requiring spatial alignment. When transferring style between non-aligned face images, for example, the loss maps eyes to eyes and mouth to mouth.

Feed-forward CNNs trained for image transformation problems rely on loss functions that measure the similarity between the generated image and a target image. Most of the common loss functions assume that these images are spatially aligned and compare pixels at corresponding locations. However, for many tasks, aligned training pairs of images will not be available. We present an alternative loss function that does not require alignment, thus providing an effective and simple solution for a new space of problems. Our loss is based on both context and semantics -- it compares regions with similar semantic meaning, while considering the context of the entire image. Hence, for example, when transferring the style of one face to another, it will translate eyes-to-eyes and mouth-to-mouth. Our code can be found at https://www.github.com/roimehrez/contextualLoss
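The abstract describes the loss only in words, so the following is a minimal sketch of how an alignment-free, feature-matching loss of this kind can be computed. It is not the authors' code. It assumes the commonly cited formulation (cosine distances between feature vectors, normalized per generated feature, turned into affinities with a bandwidth, then a best-match average over target features). The names `contextual_similarity` and `contextual_loss`, and the parameters `h` and `eps` with their defaults, are illustrative assumptions. Features are assumed to be already extracted, for example from a pretrained network.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(c * c for c in vec)) or 1.0
    return [c / norm for c in vec]


def contextual_similarity(x_feats, y_feats, h=0.5, eps=1e-5):
    """Similarity between two unordered sets of feature vectors.

    x_feats: features of the generated image (list of equal-length vectors).
    y_feats: features of the target image (same vector length).
    h, eps:  assumed bandwidth and stabilizer; values are illustrative.
    """
    dim = len(y_feats[0])
    # Center both sets on the mean of the target features.
    mu = [sum(y[k] for y in y_feats) / len(y_feats) for k in range(dim)]
    xs = [_normalize([x[k] - mu[k] for k in range(dim)]) for x in x_feats]
    ys = [_normalize([y[k] - mu[k] for k in range(dim)]) for y in y_feats]

    # cx[i][j]: affinity of generated feature i to target feature j.
    cx = []
    for x in xs:
        # Cosine distance from this generated feature to every target feature.
        dist = [1.0 - sum(a * b for a, b in zip(x, y)) for y in ys]
        d_min = min(dist)
        # Distances relative to the closest match: this is the "context" part.
        rel = [d / (d_min + eps) for d in dist]
        weights = [math.exp((1.0 - r) / h) for r in rel]
        total = sum(weights)
        cx.append([w / total for w in weights])

    # For each target feature, keep its best-matching generated feature.
    best = [max(row[j] for row in cx) for j in range(len(ys))]
    return sum(best) / len(best)


def contextual_loss(x_feats, y_feats, h=0.5, eps=1e-5):
    """Negative log of the similarity; lower means the feature sets match better."""
    return -math.log(contextual_similarity(x_feats, y_feats, h, eps))
```

Because the features are treated as sets, reordering `x_feats` (the analogue of moving regions around in the image) leaves the loss unchanged, which is the property that removes the need for spatially aligned pairs. In practice this computation would be written with a tensor library so that gradients can flow to the generator; the pure-Python version here is only meant to make the steps readable.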
