Contrastive Monotonic Pixel-Level Modulation
MonoPix addresses a gap in neural image translation and low-level vision by enabling pixel-level spatial modulation, offering new solutions for tasks such as low-light enhancement and noise generation that go beyond traditional one-to-one methods.
The paper tackles the problem of continuous one-to-many mapping in vision tasks by introducing MonoPix, an unsupervised contrastive model with pixel-level spatial control, achieving state-of-the-art performance on tasks like AFHQ cat-dog and Yosemite summer-winter translation.
Continuous one-to-many mapping is a less investigated yet important task in both low-level vision and neural image translation. In this paper, we present a new formulation called MonoPix, an unsupervised and contrastive continuous modulation model, and take a step further to enable pixel-level spatial control, which is critical but could not be properly handled by previous methods. The key feature of this work is to model the monotonicity between the controlling signals and the domain discriminator with a novel contrastive modulation framework and corresponding monotonicity constraints. We also introduce a selective inference strategy that has logarithmic approximation complexity and supports fast domain adaptation. State-of-the-art performance is validated on a variety of continuous mapping tasks, including AFHQ cat-dog and Yosemite summer-winter translation. The introduced approach also provides a new solution for many low-level tasks, such as low-light enhancement and natural noise generation, that goes beyond the long-established practice of one-to-one training and inference. Code is available at https://github.com/lukun199/MonoPix.
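The abstract does not spell out the selective inference strategy, so the following is only an illustrative sketch of why monotonicity can yield logarithmic search cost, not the procedure implemented in MonoPix. It assumes a scalar control intensity in [0, 1] and a score that is non-decreasing in that intensity (standing in for a discriminator response); under that assumption, a target score can be located by bisection. The names `bisect_control` and `toy_score` are hypothetical and do not come from the paper or its repository.

```python
from typing import Callable, Tuple


def bisect_control(
    score: Callable[[float], float],
    target: float,
    lo: float = 0.0,
    hi: float = 1.0,
    tol: float = 1e-3,
) -> Tuple[float, int]:
    """Find a control intensity whose score is close to `target`.

    Assumes `score` is non-decreasing on [lo, hi]; this is the
    monotonicity assumption, and the search is invalid without it.
    Returns the estimated intensity and the number of score evaluations.
    """
    evals = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        evals += 1
        if score(mid) < target:
            lo = mid  # target lies at a higher intensity
        else:
            hi = mid  # target lies at this intensity or lower
    return 0.5 * (lo + hi), evals


def toy_score(c: float) -> float:
    """Hypothetical stand-in for a discriminator score; monotone on [0, 1]."""
    return c ** 2


intensity, n_evals = bisect_control(toy_score, target=0.25)
print(f"intensity={intensity:.4f}, evaluations={n_evals}")
```

Each iteration halves the search interval, so the number of evaluations grows with the logarithm of the requested precision rather than linearly, as it would with a dense sweep over intensities.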