CV · IV · Jul 8, 2024

Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation

arXiv:2407.06095v1 · 9 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the latency issue in SAR-to-optical image translation for remote sensing applications, offering a flexible trade-off between speed and quality, though it is incremental as it builds on existing diffusion and GAN methods.

The paper tackles slow diffusion-model inference in SAR-to-optical image translation. It proposes a training framework that combines consistency distillation with adversarial learning, and reports a 131x inference speedup while maintaining image quality.

Synthetic Aperture Radar (SAR) provides all-weather, high-resolution imaging capabilities, but its unique imaging mechanism often requires expert interpretation, limiting its widespread applicability. Translating SAR images into more easily recognizable optical images using diffusion models helps address this challenge. However, diffusion models suffer from high latency due to their many iterative inference steps, while Generative Adversarial Networks (GANs) can achieve image translation in a single step, but often at the cost of image quality. To overcome these issues, we propose a new training framework for SAR-to-optical image translation that combines the strengths of both approaches. Our method employs consistency distillation to reduce the number of iterative inference steps and integrates adversarial learning to ensure image clarity and minimize color shifts. Additionally, our approach allows for a trade-off between quality and speed, providing flexibility based on application requirements. We conducted experiments on the SEN12 and GF3 datasets, performing quantitative evaluations using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Fréchet Inception Distance (FID), and measuring inference latency. The results demonstrate that our approach speeds up inference by a factor of 131 while maintaining the visual quality of the generated images, thus offering a robust and efficient solution for SAR-to-optical image translation.
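
Of the three quality metrics named in the abstract, PSNR is the simplest to state: it is 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the reference optical image and the generated one. The following is a minimal sketch for illustration only, not the authors' evaluation code. The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made here.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images.

    Both images are given as flat sequences of pixel values of equal length
    (an assumption of this sketch; real pipelines usually use arrays).
    max_value is the largest possible pixel value, 255 for 8-bit images.
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 2x2 grayscale patches, flattened row by row.
    optical_reference = [100, 100, 100, 100]
    translated_output = [110, 90, 110, 90]
    print(f"PSNR: {psnr(optical_reference, translated_output):.2f} dB")
```

Higher PSNR means the generated image is closer to the reference pixel by pixel. It does not capture perceptual structure or distribution-level realism, which is presumably why the abstract reports SSIM and FID alongside it.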

