CV · Mar 15, 2022

Multi-Curve Translator for High-Resolution Photorealistic Image Translation

arXiv:2203.07756v2 · 6 citations · h-index: 6
AI Analysis

This work addresses efficiency issues for computer vision researchers and practitioners working with high-resolution image translation, though the contribution is incremental because it builds on existing base models.

The paper tackles the high computational cost of fully convolutional networks in high-resolution photorealistic image-to-image translation. It proposes the Multi-Curve Translator (MCT), which takes only a downsampled image as input yet predicts the full-resolution output, enabling real-time processing of 4K images with performance comparable to or better than the base models.

The dominant image-to-image translation methods are based on fully convolutional networks, which extract and translate an image's features and then reconstruct the image. However, they have unacceptable computational costs when working with high-resolution images. To this end, we present the Multi-Curve Translator (MCT), which not only predicts the translated pixels for the corresponding input pixels but also for their neighboring pixels. And if a high-resolution image is downsampled to its low-resolution version, the lost pixels are the remaining pixels' neighboring pixels. So MCT makes it possible to feed the network only the downsampled image to perform the mapping for the full-resolution image, which can dramatically lower the computational cost. Besides, MCT is a plug-in approach that utilizes existing base models and requires only replacing their output layers. Experiments demonstrate that the MCT variants can process 4K images in real-time and achieve comparable or even better performance than the base models on various photorealistic image-to-image translation tasks.
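The idea in the abstract, predicting a mapping at low resolution that also covers the neighboring full-resolution pixels, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify how the curves are parameterized or how they are shared among neighbors, so the sketch assumes piecewise-linear curves sampled at K evenly spaced intensities, one per low-resolution pixel and channel, with each curve shared by the block of full-resolution pixels that the low-resolution pixel covers. The function name `apply_curves` and the array layout are hypothetical.

```python
import numpy as np


def apply_curves(full_res, curves):
    """Apply curves predicted at low resolution to a full-resolution image.

    full_res: float array (H, W, C) with values in [0, 1].
    curves:   float array (h, w, C, K), K >= 2. curves[i, j, c] holds the
              outputs of one curve at K evenly spaced inputs in [0, 1].
              (Assumed layout; a network would normally predict this
              from the downsampled image.)
    H and W must be integer multiples of h and w.
    """
    H, W, C = full_res.shape
    h, w, c, K = curves.shape
    assert c == C and K >= 2 and H % h == 0 and W % w == 0

    # Share each low-res pixel's curve with the full-res block it covers
    # (nearest-neighbor sharing is an assumption of this sketch).
    up = np.repeat(np.repeat(curves, H // h, axis=0), W // w, axis=1)

    # Piecewise-linear lookup of every full-res pixel in its own curve.
    pos = np.clip(full_res, 0.0, 1.0) * (K - 1)
    lo = np.minimum(np.floor(pos).astype(np.int64), K - 2)
    frac = pos - lo
    left = np.take_along_axis(up, lo[..., None], axis=-1)[..., 0]
    right = np.take_along_axis(up, (lo + 1)[..., None], axis=-1)[..., 0]
    return (1.0 - frac) * left + frac * right
```

In this sketch the expensive part (predicting `curves`) would run only on the downsampled image, while the full-resolution work is a per-pixel table lookup. That split is the source of the savings the abstract describes.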

Code Implementations: 1 repo