CV · Mar 17, 2022

Modulated Contrast for Versatile Image Synthesis

arXiv:2203.09333v1 · 51 citations · h-index: 68 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses blur and artifacts in generated images, a concern for computer vision researchers, though it appears incremental because it builds on existing contrastive learning methods.

The paper tackles the problem of measuring image similarity for visual generation tasks by proposing MoNCE, a metric that uses modulated contrast to learn calibrated distances and that outperforms existing metrics on image translation tasks.

Perceiving the similarity between images has been a long-standing and fundamental problem underlying various visual generation tasks. Predominant approaches measure the inter-image distance by computing pointwise absolute deviations, which tends to estimate the median of instance distributions and leads to blur and artifacts in the generated images. This paper presents MoNCE, a versatile metric that introduces image contrast to learn a calibrated metric for the perception of multifaceted inter-image distances. Unlike vanilla contrast, which indiscriminately pushes negative samples away from the anchor regardless of their similarity, we propose to re-weight the pushing force of negative samples adaptively according to their similarity to the anchor, which facilitates contrastive learning from informative negative samples. Since multiple patch-level contrastive objectives are involved in image distance measurement, we introduce optimal transport in MoNCE to modulate the pushing force of negative samples collaboratively across multiple contrastive objectives. Extensive experiments over multiple image translation tasks show that the proposed MoNCE outperforms various prevailing metrics substantially. The code is available at https://github.com/fnzhan/MoNCE.
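
The re-weighting idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal pure-Python illustration under stated assumptions. It takes a square matrix `sim` of patch similarities in which entry `sim[i][i]` is the positive pair for anchor `i` and every off-diagonal entry is a negative. The function names `sinkhorn_weights`, `vanilla_nce`, and `modulated_nce`, the temperatures `tau` and `beta`, the choice of Sinkhorn iterations as the optimal transport solver, and the choice to give more weight to more similar negatives are all assumptions made for illustration.

```python
import math


def sinkhorn_weights(sim, beta=1.0, n_iters=100):
    """Weights for negatives, normalised jointly across all anchors.

    Assumption: entropic optimal transport solved by Sinkhorn iterations,
    with kernel exp(sim / beta) so that more similar negatives get more mass.
    The diagonal (positive pairs) is excluded by zeroing it.
    """
    n = len(sim)
    kernel = [
        [0.0 if i == j else math.exp(sim[i][j] / beta) for j in range(n)]
        for i in range(n)
    ]
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(n_iters):
        # Alternate row and column scaling until both marginals are ~1.
        u = [1.0 / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [1.0 / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(n)]


def vanilla_nce(sim, tau=0.1):
    """Plain InfoNCE: every negative pushes with the same force."""
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = math.exp(sim[i][i] / tau)
        neg = sum(math.exp(sim[i][j] / tau) for j in range(n) if j != i)
        total += -math.log(pos / (pos + neg))
    return total / n


def modulated_nce(sim, weights, tau=0.1):
    """InfoNCE with a per-negative weight on each term in the denominator."""
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = math.exp(sim[i][i] / tau)
        neg = sum(
            weights[i][j] * math.exp(sim[i][j] / tau) for j in range(n) if j != i
        )
        # Rescale so that uniform weights 1/(n-1) recover vanilla InfoNCE.
        neg *= n - 1
        total += -math.log(pos / (pos + neg))
    return total / n
```

Because each row and each column of the weight matrix is normalised to sum to one, the weight a negative receives for one anchor depends on how it is used by the other anchors, which is one way to read "collaboratively across multiple contrastive objectives". Whether hard or easy negatives should receive more weight, and the exact transport cost, are design choices this sketch does not settle.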

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
