Wenbin Wei

LG
h-index: 14
Papers: 6
Citations: 172
Novelty: 58%
AI Score: 49

6 Papers

LG | Oct 16, 2023 | Code
FATE-LLM: A Industrial Grade Federated Learning Framework for Large Language Models

Tao Fan, Yan Kang, Guoqiang Ma et al.

Large Language Models (LLMs), such as ChatGPT, LLaMA, GLM, and PaLM, have exhibited remarkable performance across various tasks in recent years. However, LLMs face two main challenges in real-world applications. One challenge is that training LLMs consumes vast computing resources, preventing LLMs from being adopted by small and medium-sized enterprises with limited computing resources. Another is that training LLMs requires a large amount of high-quality data, which is often scattered among enterprises. To address these challenges, we propose FATE-LLM, an industrial-grade federated learning framework for large language models. FATE-LLM (1) facilitates federated learning for large language models (coined FedLLM); (2) promotes efficient training of FedLLM using parameter-efficient fine-tuning methods; (3) protects the intellectual property of LLMs; (4) preserves data privacy during training and inference through privacy-preserving mechanisms. We release the code of FATE-LLM at https://github.com/FederatedAI/FATE-LLM to facilitate the research of FedLLM and enable a broad range of industrial applications.

CV | Jun 21, 2023 | Code
OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

Weihao Gao, Zhuo Deng, Zhiyuan Niu et al.

Large multimodal language models (LMMs) have achieved significant success in general domains. However, due to the significant differences between medical images and text and general web content, the performance of LMMs in medical scenarios is limited. In ophthalmology, clinical diagnosis relies on multiple modalities of medical images, but unfortunately, multimodal ophthalmic large language models have not been explored to date. In this paper, we study and construct an ophthalmic large multimodal model. Firstly, we use fundus images as an entry point to build a disease assessment and diagnosis pipeline to achieve common ophthalmic disease diagnosis and lesion segmentation. Then, we establish a new ophthalmic multimodal instruction-following and dialogue fine-tuning dataset based on disease-related knowledge data and publicly available real-world medical dialogue. We introduce visual ability into the large language model to complete the ophthalmic large language and vision assistant (OphGLM). Our experimental results demonstrate that the OphGLM model performs exceptionally well, and it has the potential to revolutionize clinical applications in ophthalmology. The dataset, code, and models will be made publicly available at https://github.com/ML-AILab/OphGLM.

29.8 | LG | May 8
SGC-RML: A reliable and interpretable longitudinal assessment for PD in real-world DNS

Wenbin Wei, Ruixiang Gao, Suyuan Yao et al.

Real-world digital Parkinson's disease assessment faces challenges such as heterogeneous modalities, cross-device bias, and incomplete labeling. Existing methods often focus on average predictive performance, lacking the reliability mechanisms needed for retrospective reliability-aware assessment, namely determining when the model is reliable, when to reject an assessment, when to retest, and on which symptom dimensions the predictions are based. This paper proposes SGC-RML, which maps speech, gait, wearable motion, mobility tasks, and clinical variables to a shared 8-dimensional symptom node space (7 clinical symptom nodes and 1 reliability_state auxiliary node), unifying motor and non-motor representations through a symptom atlas. By jointly introducing uncertainty estimation, conformal calibration, and selective decision routing, the model can not only predict symptoms and severity but also reject assessments or suggest retests when evidence is insufficient. We validate this framework on five real-world PD datasets, covering classification, regression, event detection, and longitudinal severity prediction. Experiments show that SGC-RML achieves an MAE of 4.579 / R^2 of 0.772 on PPMI, an AUC of 0.953 on mPower, and an AUC of 0.825 on PADS. Under leak-free temporal anchoring, as few as 5 subject-specific anchors transform UCI from an essentially non-predictive subject-independent setting (motor MAE 8.38, CCC 0.02) into a calibrated longitudinal assessment (motor MAE 3.24, CCC 0.756) with split-conformal coverage held at the 0.80 target. Under the Daphnet LOSO protocol, it achieves an F1 of 0.803 / AUC of 0.872. These results demonstrate that SGC-RML provides a unified paradigm for accurate, calibrated, auditable, and symptom-interpretable retrospective longitudinal assessment of PD under incomplete multimodal conditions.
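
The abstract reports split-conformal coverage held at a 0.80 target but does not describe the procedure. As a rough illustration only, the sketch below shows generic split conformal prediction for a regression output, using absolute residuals as the nonconformity score. The synthetic data, the noise level, and all function names are assumptions made for this example; none of them come from the SGC-RML paper.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Finite-sample split-conformal quantile of calibration scores.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest score, which gives
    marginal coverage of at least 1 - alpha for exchangeable data.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank
    if k > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return float("inf")
    return sorted(scores)[k - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around a point prediction."""
    return (point_prediction - q, point_prediction + q)


def make_pairs(rng, n):
    """Hypothetical (true score, model prediction) pairs with Gaussian error."""
    pairs = []
    for _ in range(n):
        y = rng.uniform(0, 40)
        y_hat = y + rng.gauss(0, 3)
        pairs.append((y, y_hat))
    return pairs


rng = random.Random(0)
calibration = make_pairs(rng, 500)  # held out from model fitting
test = make_pairs(rng, 2000)

alpha = 0.20  # target coverage 1 - alpha = 0.80
q = conformal_quantile([abs(y - y_hat) for y, y_hat in calibration], alpha)

covered = 0
for y, y_hat in test:
    low, high = prediction_interval(y_hat, q)
    if low <= y <= high:
        covered += 1
coverage = covered / len(test)

print(f"interval half-width: {q:.2f}")
print(f"empirical coverage:  {coverage:.3f}")
```

On exchangeable data the empirical coverage should land near the 0.80 target; how the paper combines this kind of calibration with subject-specific anchors is not stated in the abstract.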

LG | Mar 8, 2024
A Concept-based Interpretable Model for the Diagnosis of Choroid Neoplasias using Multimodal Data

Yifan Wu, Yang Liu, Yue Yang et al.

Diagnosing rare diseases presents a common challenge in clinical practice, necessitating the expertise of specialists for accurate identification. The advent of machine learning offers a promising solution, but the development of such technologies is hindered by the scarcity of data on rare conditions and the demand for models that are both interpretable and trustworthy in a clinical context. Interpretable AI, with its capacity for human-readable outputs, can facilitate validation by clinicians and contribute to medical education. In the current work, we focus on choroid neoplasias, the most prevalent form of eye cancer in adults, albeit rare, at 5.1 per million. We built the largest dataset to date, comprising 750 patients and incorporating three distinct imaging modalities collected from 2004 to 2022. Our work introduces a concept-based interpretable model that distinguishes between three types of choroidal tumors, integrating insights from domain experts via radiological reports. Remarkably, this model not only achieves an F1 score of 0.91, rivaling that of black-box models, but also boosts the diagnostic accuracy of junior doctors by 42%. This study highlights the significant potential of interpretable machine learning in improving the diagnosis of rare diseases, laying a groundwork for future breakthroughs in medical AI that could tackle a wider array of complex health scenarios.

CV | Jan 26
Fair-Eye Net: A Fair, Trustworthy, Multimodal Integrated Glaucoma Full Chain AI System

Wenbin Wei, Suyuan Yao, Cheng Huang et al.

Glaucoma is a top cause of irreversible blindness globally, making early detection and longitudinal follow-up pivotal to preventing permanent vision loss. Current screening and progression assessment, however, rely on single tests or loosely linked examinations, introducing subjectivity and fragmented care. Limited access to high-quality imaging tools and specialist expertise further compromises consistency and equity in real-world use. To address these gaps, we developed Fair-Eye Net, a fair, reliable multimodal AI system closing the clinical loop from glaucoma screening to follow-up and risk alerting. It integrates fundus photos, OCT structural metrics, VF functional indices, and demographic factors via a dual-stream heterogeneous fusion architecture, with an uncertainty-aware hierarchical gating strategy for selective prediction and safe referral. A fairness constraint reduces missed diagnoses in disadvantaged subgroups. Experimental results show it achieved an AUC of 0.912 (96.7% specificity), cut racial false-negativity disparity by 73.4% (12.31% to 3.28%), maintained stable cross-domain performance, and enabled 3-12 months of early risk alerts (92% sensitivity, 88% specificity). Unlike post hoc fairness adjustments, Fair-Eye Net optimizes fairness as a primary goal with clinical reliability via multitask learning, offering a reproducible path for clinical translation and large-scale deployment to advance global eye health equity.
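
The reported 73.4% cut in false-negative disparity appears to be a relative reduction of the gap, not a difference in percentage points. A minimal check of that reading, using only the two figures quoted in the abstract (the function name is hypothetical):

```python
def relative_reduction(before, after):
    """Fraction by which a quantity shrinks relative to its starting value."""
    return (before - after) / before


# Disparity figures quoted in the abstract, in percentage points.
disparity_before = 12.31
disparity_after = 3.28

reduction = relative_reduction(disparity_before, disparity_after)
absolute_drop = disparity_before - disparity_after

print(f"relative reduction: {reduction:.1%}")
print(f"absolute drop: {absolute_drop:.2f} percentage points")
```

Under this reading the absolute gap shrinks by about 9 percentage points, which corresponds to the quoted relative figure.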

IV | Nov 19, 2024
Versatile Cataract Fundus Image Restoration Model Utilizing Unpaired Cataract and High-quality Images

Zheng Gong, Zhuo Deng, Weihao Gao et al.

Cataract is one of the most common blinding eye diseases and can be treated by surgery. However, because cataract patients may also suffer from other blinding eye diseases, ophthalmologists must diagnose them before surgery. The cloudy lens of cataract patients causes hazy degradation in fundus images, making it challenging to observe the patient's fundus vessels and complicating the diagnosis process. To address this issue, this paper establishes a new cataract image restoration method named Catintell. It contains a cataract image synthesizing model, Catintell-Syn, and a restoration model, Catintell-Res. Catintell-Syn uses a GAN architecture with fully unsupervised data to generate paired cataract-like images with realistic style and texture rather than the conventional Gaussian degradation algorithm. Meanwhile, Catintell-Res is an image restoration network that can improve the quality of real cataract fundus images using the knowledge learned from synthetic cataract images. Extensive experiments show that Catintell-Res outperforms other cataract image restoration methods, with a PSNR of 39.03 and an SSIM of 0.9476. Furthermore, the universal restoration ability that Catintell-Res gained from unpaired cataract images can process cataract images from various datasets. We hope the models can help ophthalmologists identify other blinding eye diseases of cataract patients and inspire more medical image restoration methods in the future.
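
For readers unfamiliar with the PSNR metric cited above, the following sketch computes it from the standard definition, 10 * log10(MAX^2 / MSE). It assumes 8-bit pixel values (MAX = 255) and uses tiny made-up images; the abstract does not state the data range or evaluation protocol the authors used.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as lists of rows of pixel intensities. max_value is the
    largest possible pixel value (255 is an assumption for 8-bit images).
    """
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up 2x2 grayscale patches for illustration only.
clean = [[100, 100], [100, 100]]
restored = [[101, 99], [100, 100]]

score = psnr(clean, restored)
print(f"PSNR: {score:.2f} dB")
```

Higher values indicate a restored image closer to the reference; SSIM, the other metric quoted, measures structural similarity and is not covered by this sketch.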