ELF: An End-to-end Local and Global Multimodal Fusion Framework for Glaucoma Grading
This work addresses early detection of glaucoma, a chronic eye disease, by improving multimodal fusion for medical imaging, though it appears incremental in its approach.
The paper tackles glaucoma grading by proposing an end-to-end multimodal fusion framework (ELF) that integrates 2D fundus images and 3D OCT data, achieving state-of-the-art results on the GAMMA dataset.
Glaucoma is a chronic neurodegenerative condition that can lead to blindness. Early detection and treatment are essential to stop the disease from progressing. 2D fundus images and optical coherence tomography (OCT) are useful to ophthalmologists in diagnosing glaucoma. Many methods rely on either fundus images or 3D OCT volumes alone; however, mining multi-modal information from both fundus images and OCT data is less studied. In this work, we propose an end-to-end local and global multi-modal fusion framework for glaucoma grading, named ELF for short. ELF can fully utilize the complementary information between fundus and OCT. In addition, unlike previous methods that simply concatenate the multi-modal features and therefore do not explore the mutual information between modalities, ELF takes advantage of both local-wise and global-wise mutual information. Extensive experiments on the multi-modal glaucoma grading GAMMA dataset demonstrate the effectiveness of ELF compared with other state-of-the-art methods.
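To make the contrast between plain concatenation and local/global fusion concrete, the following is a minimal NumPy sketch. It is not ELF's actual architecture, which the abstract does not specify. The cross-attention used for the local step, the element-wise product used for the global step, the token counts, the feature width, and all function names are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def concat_fusion(fundus_tokens, oct_tokens):
    # Baseline: pool each modality on its own, then concatenate.
    # The two modalities never interact before the classifier.
    return np.concatenate([fundus_tokens.mean(axis=0), oct_tokens.mean(axis=0)])

def attention_weights(query_tokens, context_tokens):
    # Scaled dot-product attention weights, one row per query token.
    d = query_tokens.shape[-1]
    return softmax(query_tokens @ context_tokens.T / np.sqrt(d), axis=-1)

def local_cross_attention(query_tokens, context_tokens):
    # ASSUMPTION: "local" interaction modeled as cross-attention, where each
    # token of one modality attends over all tokens of the other modality.
    weights = attention_weights(query_tokens, context_tokens)
    return query_tokens + weights @ context_tokens  # residual connection

def global_interaction(fundus_vec, oct_vec):
    # ASSUMPTION: "global" interaction modeled as an element-wise product
    # of the pooled modality descriptors.
    return fundus_vec * oct_vec

def local_global_fusion(fundus_tokens, oct_tokens):
    f_local = local_cross_attention(fundus_tokens, oct_tokens).mean(axis=0)
    o_local = local_cross_attention(oct_tokens, fundus_tokens).mean(axis=0)
    g = global_interaction(fundus_tokens.mean(axis=0), oct_tokens.mean(axis=0))
    return np.concatenate([f_local, o_local, g])

# Hypothetical feature maps: 49 fundus patch tokens and 64 OCT tokens, width 32.
rng = np.random.default_rng(0)
fundus_tokens = rng.normal(size=(49, 32))
oct_tokens = rng.normal(size=(64, 32))

baseline = concat_fusion(fundus_tokens, oct_tokens)
fused = local_global_fusion(fundus_tokens, oct_tokens)
print(baseline.shape, fused.shape)
```

In this sketch the baseline vector is built from two independently pooled descriptors, whereas the fused vector contains terms that depend jointly on both modalities at the token level and at the pooled level. Either vector would then be passed to a grading classifier.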