Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences
This work addresses the need for better diagnostic tools for glaucoma patients by providing a method to visualize disease progression, though it is incremental as it builds on existing diffusion models for medical imaging.
The paper tackled the problem of predicting glaucoma progression by generating prospective fundus images from irregular longitudinal sequences using a diffusion model, resulting in improved precision for downstream classification tasks.
The utilization of longitudinal datasets for glaucoma progression prediction offers a compelling approach to support early therapeutic interventions. Predominant methodologies in this domain have primarily focused on the direct prediction of glaucoma stage labels from longitudinal datasets. However, such methods may not adequately encapsulate the nuanced developmental trajectory of the disease. To enhance the diagnostic acumen of medical practitioners, we propose a novel diffusion-based model to predict prospective images by extrapolating from existing longitudinal fundus images of patients. The methodology delineated in this study distinctively leverages sequences of images as inputs. Subsequently, a time-aligned mask is employed to select a specific year for image generation. During the training phase, the time-aligned mask resolves the issue of irregular temporal intervals in longitudinal image sequence sampling. Additionally, we utilize a strategy of randomly masking a frame in the sequence to establish the ground truth. This methodology aids the network in continuously acquiring knowledge regarding the internal relationships among the sequences throughout the learning phase. Moreover, the introduction of textual labels is instrumental in categorizing images generated within the sequence. The empirical findings from the conducted experiments indicate that our proposed model not only effectively generates longitudinal data but also significantly improves the precision of downstream classification tasks.