CV · AI · IV · Feb 18, 2023

When Visible-to-Thermal Facial GAN Beats Conditional Diffusion

arXiv:2302.09395v1 · 7 citations · h-index: 30
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific need in telemedicine by enabling thermal imaging without specialized sensors, though it appears incremental as it builds on existing GAN and diffusion methods.

The paper tackles the problem of generating thermal facial imagery from visible-spectrum images for telemedicine. It proposes VTF-GAN, which outperforms several GAN baselines and a conditional diffusion model in producing high-quality thermal faces.

Thermal facial imagery offers valuable insight into physiological states such as inflammation and stress by detecting emitted radiation in the infrared spectrum, which cannot be seen in the visible spectrum. Telemedicine applications could benefit from thermal imagery, but conventional computers rely on RGB cameras and lack thermal sensors. We therefore propose the Visible-to-Thermal Facial GAN (VTF-GAN), which is specifically designed to generate high-resolution thermal faces by learning both the spatial and frequency domains of facial regions across spectra. We compare VTF-GAN against several popular GAN baselines and the first conditional Denoising Diffusion Probabilistic Model (DDPM) for VT face translation (VTF-Diff). Results show that, compared to all baselines including diffusion, VTF-GAN achieves high-quality, crisp, and perceptually realistic thermal faces using a combined set of patch, temperature, perceptual, and Fourier Transform losses.
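To make the idea of learning in both the spatial and frequency domains concrete, here is a minimal sketch of how a pixel-space term and a Fourier Transform term might be combined into one loss. It is an illustration under stated assumptions, not the paper's formulation: the function names, the L1 form of each term, and the weights `w_spatial` and `w_fourier` are all hypothetical, and the patch, temperature, and perceptual terms are omitted. It uses a naive DFT on tiny list-of-rows images so that it runs with the Python standard library alone.

```python
import cmath
import math


def dft2(image):
    """Naive 2D discrete Fourier transform of a small grayscale image (list of rows)."""
    rows, cols = len(image), len(image[0])
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * math.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out_row.append(total)
        out.append(out_row)
    return out


def spatial_l1(generated, target):
    """Mean absolute pixel difference (spatial-domain term)."""
    diffs = [abs(g - t)
             for g_row, t_row in zip(generated, target)
             for g, t in zip(g_row, t_row)]
    return sum(diffs) / len(diffs)


def fourier_l1(generated, target):
    """Mean absolute difference between DFT coefficients (frequency-domain term)."""
    gen_f, tgt_f = dft2(generated), dft2(target)
    diffs = [abs(g - t)
             for g_row, t_row in zip(gen_f, tgt_f)
             for g, t in zip(g_row, t_row)]
    return sum(diffs) / len(diffs)


def combined_loss(generated, target, w_spatial=1.0, w_fourier=0.1):
    """Weighted sum of both terms; the weights are hypothetical, not from the paper."""
    return (w_spatial * spatial_l1(generated, target)
            + w_fourier * fourier_l1(generated, target))


# Toy 2x2 "images": a checkerboard target and a flat gray (blurred) guess.
target = [[0.0, 1.0], [1.0, 0.0]]
generated = [[0.5, 0.5], [0.5, 0.5]]

identical_loss = combined_loss(target, target)
blurred_loss = combined_loss(generated, target)
```

In this toy case the flat gray guess has lost the checkerboard's highest-frequency component, so the Fourier term adds a penalty on top of the pixel-wise error. This is the kind of signal a frequency-domain loss can use to discourage blurry outputs.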
