A Vessel-Segmentation-Based CycleGAN for Unpaired Multi-modal Retinal Image Synthesis
This work addresses the need for efficient training-data augmentation in retinal image analysis. Its contribution is incremental, since it builds on existing CycleGAN and segmentation techniques.
The paper tackles the problem of generating realistic multi-modal retinal images for training registration methods. It integrates a vessel segmentation network into a CycleGAN framework, which yields visually realistic images that preserve vessel structures.
Unpaired image-to-image translation of retinal images can efficiently enlarge the training dataset for deep-learning-based multi-modal retinal registration methods. Our method integrates a vessel segmentation network into the image-to-image translation task by extending the CycleGAN framework. The segmentation network is inserted prior to a UNet vision transformer generator network and serves as a shared representation between both domains. We reformulate the original identity loss to learn the direct mapping between the vessel segmentation and the real image. Additionally, we add a segmentation loss term to ensure shared vessel locations between fake and real images. In the experiments, our method produces visually realistic images and preserves the vessel structures, which is a prerequisite for generating multi-modal training data for image registration.
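The structure-preserving loss terms described in the abstract could be combined roughly as in the following sketch. It is an illustration under stated assumptions, not the authors' implementation: images are flattened lists of floats, the networks are arbitrary callables, L1 distance is used for every term, the adversarial term is omitted, and the function name `structure_losses` and the weights `w_cyc`, `w_idt`, and `w_seg` are hypothetical.

```python
from typing import Callable, Dict, List

Image = List[float]               # toy stand-in: a flattened image
Net = Callable[[Image], Image]    # toy stand-in: any network


def l1(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def structure_losses(
    real_a: Image,
    real_b: Image,
    seg: Net,       # vessel segmentation network (shared representation)
    gen_ab: Net,    # generator mapping a vessel map to domain B
    gen_ba: Net,    # generator mapping a vessel map to domain A
    w_cyc: float = 10.0,  # hypothetical weights, not taken from the paper
    w_idt: float = 5.0,
    w_seg: float = 1.0,
) -> Dict[str, float]:
    """Non-adversarial loss terms for the A -> B -> A direction (assumed form)."""
    # The segmentation network sits in front of each generator.
    vessels_a = seg(real_a)
    fake_b = gen_ab(vessels_a)
    vessels_fake_b = seg(fake_b)
    rec_a = gen_ba(vessels_fake_b)

    losses = {
        # Cycle consistency: translating there and back should recover real_a.
        "cycle": l1(rec_a, real_a),
        # Reformulated identity term (assumed): the vessel map of a real B
        # image should map directly back to that real image.
        "identity": l1(gen_ab(seg(real_b)), real_b),
        # Segmentation term (assumed): fake and real images share vessel
        # locations.
        "segmentation": l1(vessels_fake_b, vessels_a),
    }
    losses["total"] = (
        w_cyc * losses["cycle"]
        + w_idt * losses["identity"]
        + w_seg * losses["segmentation"]
    )
    return losses
```

In a real training loop the same terms would be computed for the B -> A -> B direction as well and added to the adversarial losses of both discriminators. The choice of L1 for the segmentation term is a simplification, and a cross-entropy or Dice-style loss would be an equally plausible choice.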