CV · Jun 10, 2025

MIRAGE: Multimodal foundation model and benchmark for comprehensive retinal OCT image analysis

arXiv:2506.08900v3 · 14 citations · h-index: 16 · Has Code · npj Digital Medicine
Originality: Incremental advance
AI Analysis

This work addresses clinicians' need for robust AI systems for retinal OCT image analysis, though it appears incremental because it extends existing foundation model concepts with multimodal integration.

The authors tackled the problem of limited validation and single-modality focus in foundation models for ophthalmology by proposing MIRAGE, a multimodal foundation model for OCT and SLO image analysis, which outperformed existing methods in classification and segmentation tasks.

Artificial intelligence (AI) has become a fundamental tool for assisting clinicians in analyzing ophthalmic images, such as optical coherence tomography (OCT). However, developing AI models often requires extensive annotation, and existing models tend to underperform on independent, unseen data. Foundation models (FMs), large AI models trained on vast unlabeled datasets, have shown promise in overcoming these challenges. Nonetheless, available FMs for ophthalmology lack extensive validation, especially for segmentation tasks, and focus on a single imaging modality. In this context, we propose MIRAGE, a novel multimodal FM for the analysis of OCT and scanning laser ophthalmoscopy (SLO) images. Additionally, we propose a new evaluation benchmark with OCT/SLO classification and segmentation tasks. The comparison with general and specialized FMs and segmentation methods shows the superiority of MIRAGE in both types of tasks, highlighting its suitability as a basis for the development of robust AI systems for retinal OCT image analysis. Both MIRAGE and the evaluation benchmark are publicly available: https://github.com/j-morano/MIRAGE.

Code Implementations: 1 repo
