IV · CV · Dec 12, 2022

CTT-Net: A Multi-view Cross-token Transformer for Cataract Postoperative Visual Acuity Prediction

arXiv:2212.05794v1 · 1 citation · h-index: 25 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a critical clinical need: helping ophthalmologists assess whether cataract surgery is necessary. It is incremental, building on prior deep learning approaches with specific enhancements.

The paper tackles the problem of predicting postoperative visual acuity for cataract patients by analyzing multi-view OCT images and preoperative VA, achieving improved performance over existing methods across various metrics.

Surgery is the only viable treatment for cataract patients with visual acuity (VA) impairment. Clinically, to assess the necessity of cataract surgery, it is crucial to accurately predict postoperative VA before surgery by analyzing multi-view optical coherence tomography (OCT) images. Unfortunately, due to complicated fundus conditions, determining postoperative VA remains difficult for medical experts. Deep learning methods for this problem have been developed in recent years. Although effective, these methods still face several issues: they do not efficiently explore potential relations between multi-view OCT images, they neglect the key role of clinical prior knowledge (e.g., the preoperative VA value), and they rely only on regression-based metrics, which lack a reference point. In this paper, we propose a novel Cross-token Transformer Network (CTT-Net) for postoperative VA prediction that analyzes both the multi-view OCT images and the preoperative VA. To effectively fuse multi-view features of OCT images, we develop cross-token attention that can restrict redundant/unnecessary attention flow. Further, we utilize the preoperative VA value to provide more information for postoperative VA prediction and to facilitate fusion between views. Moreover, we design an auxiliary classification loss to improve model performance and assess VA recovery more fully, avoiding the limitation of using only regression metrics. To evaluate CTT-Net, we build a multi-view OCT image dataset collected from our collaborative hospital. Extensive experiments validate the effectiveness of our model compared to existing methods on various metrics. Code is available at: https://github.com/wjh892521292/Cataract OCT.
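
The abstract does not spell out how cross-token attention restricts attention flow, so the following is only a rough sketch of one plausible reading, not the authors' implementation. It assumes that tokens from each OCT view attend within their own view, and that information crosses views only through a small set of shared "cross tokens" (which could, for instance, carry an embedding of the preoperative VA). The function names `cross_token_mask` and `masked_attention`, the token layout, and the single-head NumPy formulation are all hypothetical choices made for illustration.

```python
import numpy as np

def cross_token_mask(num_views, tokens_per_view, num_cross):
    """Boolean mask where mask[i, j] is True if query token i may attend to key token j.

    Assumed layout (hypothetical): all view tokens first, grouped by view,
    followed by the shared cross tokens.
    """
    n_view = num_views * tokens_per_view
    n = n_view + num_cross
    mask = np.zeros((n, n), dtype=bool)
    for v in range(num_views):
        start = v * tokens_per_view
        end = start + tokens_per_view
        mask[start:end, start:end] = True  # tokens attend within their own view
    mask[:, n_view:] = True  # every token may read the cross tokens
    mask[n_view:, :] = True  # cross tokens may read every token
    return mask

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention with disallowed pairs masked out."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)  # blocked pairs get zero weight after softmax
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

With two views of three tokens each and one cross token, a token in the first view receives zero attention weight on tokens of the second view, so any cross-view information has to pass through the cross token. Whether CTT-Net uses this exact masking pattern is an assumption; the published code is the authoritative reference.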
