CV, LG | Feb 29, 2024

Disentangling representations of retinal images with generative models

arXiv:2402.19186v3 | 10 citations | h-index: 6 | Has Code | Medical Image Anal.
Originality: Incremental advance
AI Analysis

This work addresses reliability issues in AI applications for ophthalmology by reducing shortcut learning from technical confounders. It is an incremental advance because it builds on existing disentanglement methods.

The paper tackles the problem of technical factors, such as camera type, confounding retinal fundus image analysis. It introduces a population model that disentangles patient attributes from camera effects and, through a disentanglement loss based on distance correlation, enables controllable and realistic image generation.

Retinal fundus images play a crucial role in the early detection of eye diseases. However, the impact of technical factors on these images can pose challenges for reliable AI applications in ophthalmology. For example, large fundus cohorts are often confounded by factors like camera type, bearing the risk of learning shortcuts rather than the causal relationships behind the image generation process. Here, we introduce a population model for retinal fundus images that effectively disentangles patient attributes from camera effects, enabling controllable and highly realistic image generation. To achieve this, we propose a disentanglement loss based on distance correlation. Through qualitative and quantitative analyses, we show that our models encode desired information in disentangled subspaces and enable controllable image generation based on the learned subspaces, demonstrating the effectiveness of our disentanglement loss. The project's code is publicly available: https://github.com/berenslab/disentangling-retinal-images.
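The abstract names a disentanglement loss based on distance correlation but does not spell it out. As a rough illustration only, the sketch below computes the standard empirical (biased) distance correlation between two batches of latent codes in NumPy. The function names and the variables `w_patient` and `w_camera` are hypothetical and not taken from the paper or its repository, which may implement the loss differently.

```python
import numpy as np


def _centered_distances(z):
    """Double-centered pairwise Euclidean distance matrix of z, shape (n, d)."""
    diff = z[:, None, :] - z[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return (
        d
        - d.mean(axis=0, keepdims=True)
        - d.mean(axis=1, keepdims=True)
        + d.mean()
    )


def distance_correlation(x, y, eps=1e-12):
    """Empirical distance correlation between samples x (n, p) and y (n, q).

    Values lie in [0, 1]; values near 0 suggest the two sets of codes are
    close to statistically independent.
    """
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = max((a * b).mean(), 0.0)   # squared distance covariance
    dvar_x = (a * a).mean()            # squared distance variance of x
    dvar_y = (b * b).mean()            # squared distance variance of y
    return float(np.sqrt(dcov2 / (np.sqrt(dvar_x * dvar_y) + eps)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical latent subspaces for a batch of 256 images.
    w_patient = rng.normal(size=(256, 4))
    w_camera = rng.normal(size=(256, 4))
    print("independent codes:", distance_correlation(w_patient, w_camera))
    print("identical codes:  ", distance_correlation(w_patient, w_patient))
```

Under these assumptions, a training loop would add this quantity, computed between the patient-attribute and camera subspaces of each batch, as a penalty to the generative objective, so that minimizing it pushes the subspaces toward independence. Unlike linear correlation, distance correlation also responds to nonlinear dependence, which is a plausible reason for choosing it as a disentanglement criterion.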

Code Implementations: 1 repo
