IVFeb 16, 2023
A Review of Uncertainty Estimation and its Application in Medical ImagingKe Zou, Zhihao Chen, Xuedong Yuan et al.
The use of AI systems in healthcare for the early screening of diseases is of great clinical importance. Deep learning has shown great promise in medical imaging, but the reliability and trustworthiness of AI systems limit their deployment in real clinical scenes, where patient safety is at stake. Uncertainty estimation plays a pivotal role in producing a confidence evaluation along with the prediction of the deep model. This is particularly important in medical imaging, where the uncertainty in the model's predictions can be used to identify areas of concern or to provide additional information to the clinician. In this paper, we review the various types of uncertainty in deep learning, including aleatoric uncertainty and epistemic uncertainty. We further discuss how they can be estimated in medical imaging. More importantly, we review recent advances in deep learning models that incorporate uncertainty estimation in medical imaging. Finally, we discuss the challenges and future directions in uncertainty estimation in deep learning for medical imaging. We hope this review will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of uncertainty estimation models in medical imaging.
IVJun 19, 2022
TBraTS: Trusted Brain Tumor SegmentationKe Zou, Xuedong Yuan, Xiaojing Shen et al.
Despite recent improvements in the accuracy of brain tumor segmentation, the results still exhibit low levels of confidence and robustness. Uncertainty estimation is one effective way to change this situation, as it provides a measure of confidence in the segmentation results. In this paper, we propose a trusted brain tumor segmentation network which can generate robust segmentation results and reliable uncertainty estimations without excessive computational burden and modification of the backbone network. In our method, uncertainty is modeled explicitly using subjective logic theory, which treats the predictions of backbone neural network as subjective opinions by parameterizing the class probabilities of the segmentation as a Dirichlet distribution. Meanwhile, the trusted segmentation framework learns the function that gathers reliable evidence from the feature leading to the final segmentation results. Overall, our unified trusted segmentation framework endows the model with reliability and robustness to out-of-distribution samples. To evaluate the effectiveness of our model in robustness and reliability, qualitative and quantitative experiments are conducted on the BraTS 2019 dataset.
AIAug 18, 2024Code
PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image UnderstandingDawei Dai, Yuanhui Zhang, Long Xu et al.
The previous advancements in pathology image understanding primarily involved developing models tailored to specific tasks. Recent studies has demonstrated that the large vision-language model can enhance the performance of various downstream tasks in medical image understanding. In this study, we developed a domain-specific large language-vision assistant (PA-LLaVA) for pathology image understanding. Specifically, (1) we first construct a human pathology image-text dataset by cleaning the public medical image-text data for domain-specific alignment; (2) Using the proposed image-text data, we first train a pathology language-image pretraining (PLIP) model as the specialized visual encoder for pathology image, and then we developed scale-invariant connector to avoid the information loss caused by image scaling; (3) We adopt two-stage learning to train PA-LLaVA, first stage for domain alignment, and second stage for end to end visual question \& answering (VQA) task. In experiments, we evaluate our PA-LLaVA on both supervised and zero-shot VQA datasets, our model achieved the best overall performance among multimodal models of similar scale. The ablation experiments also confirmed the effectiveness of our design. We posit that our PA-LLaVA model and the datasets presented in this work can promote research in field of computational pathology. All codes are available at: https://github.com/ddw2AIGROUP2CQUPT/PA-LLaVA}{https://github.com/ddw2AIGROUP2CQUPT/PA-LLaVA
CVMar 17, 2023
Uncertainty-informed Mutual Learning for Joint Medical Image Classification and SegmentationKai Ren, Ke Zou, Xianjie Liu et al.
Classification and segmentation are crucial in medical image analysis as they enable accurate diagnosis and disease monitoring. However, current methods often prioritize the mutual learning features and shared model parameters, while neglecting the reliability of features and performances. In this paper, we propose a novel Uncertainty-informed Mutual Learning (UML) framework for reliable and interpretable medical image analysis. Our UML introduces reliability to joint classification and segmentation tasks, leveraging mutual learning with uncertainty to improve performance. To achieve this, we first use evidential deep learning to provide image-level and pixel-wise confidences. Then, an Uncertainty Navigator Decoder is constructed for better using mutual features and generating segmentation results. Besides, an Uncertainty Instructor is proposed to screen reliable masks for classification. Overall, UML could produce confidence estimation in features and performance for each link (classification and segmentation). The experiments on the public datasets demonstrate that our UML outperforms existing methods in terms of both accuracy and robustness. Our UML has the potential to explore the development of more reliable and explainable medical image analysis models. We will release the codes for reproduction after acceptance.
IVMar 17, 2023
Reliable Multimodality Eye Disease Screening via Mixture of Student's t DistributionsKe Zou, Tian Lin, Xuedong Yuan et al.
Multimodality eye disease screening is crucial in ophthalmology as it integrates information from diverse sources to complement their respective performances. However, the existing methods are weak in assessing the reliability of each unimodality, and directly fusing an unreliable modality may cause screening errors. To address this issue, we introduce a novel multimodality evidential fusion pipeline for eye disease screening, EyeMoSt, which provides a measure of confidence for unimodality and elegantly integrates the multimodality information from a multi-distribution fusion perspective. Specifically, our model estimates both local uncertainty for unimodality and global uncertainty for the fusion modality to produce reliable classification results. More importantly, the proposed mixture of Student's $t$ distributions adaptively integrates different modalities to endow the model with heavy-tailed properties, increasing robustness and reliability. Our experimental findings on both public and in-house datasets show that our model is more reliable than current methods. Additionally, EyeMost has the potential ability to serve as a data quality discriminator, enabling reliable decision-making for multimodality eye disease screening.
LGAug 24, 2022
Seamless Tracking of Group Targets and Ungrouped Targets Using Belief PropagationXuqi Zhang, Fanqin Meng, Haiqi Liu et al.
This paper considers the problem of tracking a large-scale number of group targets. Usually, multi-target in most tracking scenarios are assumed to have independent motion and are well-separated. However, for group target tracking (GTT), the targets within groups are closely spaced and move in a coordinated manner, the groups can split or merge, and the numbers of targets in groups may be large, which lead to more challenging data association, filtering and computation problems. Within the belief propagation (BP) framework, we propose a scalable group target belief propagation (GTBP) method by jointly inferring target existence variables, group structure, data association and target states. The method can efficiently calculate the approximations of the marginal posterior distributions of these variables by performing belief propagation on the devised factor graph. As a consequence, GTBP is capable of capturing the changes in group structure, e.g., group splitting and merging. Furthermore, we model the evolution of targets as the co-action of the group or single-target motions specified by the possible group structures and corresponding probabilities. This flexible modeling enables seamless and simultaneous tracking of multiple group targets and ungrouped targets. Particularly, GTBP has excellent scalability and low computational complexity. It not only maintains the same scalability as BP, i.e., scaling linearly in the number of sensor measurements and quadratically in the number of targets, but also only scales linearly in the number of preserved group partitions. Finally, numerical experiments are presented to demonstrate the effectiveness and scalability of the proposed GTBP method.
IVJan 1, 2023
Towards Reliable Medical Image Segmentation by Modeling Evidential Calibrated UncertaintyKe Zou, Yidi Chen, Ling Huang et al.
Medical image segmentation is critical for disease diagnosis and treatment assessment. However, concerns regarding the reliability of segmentation regions persist among clinicians, mainly attributed to the absence of confidence assessment, robustness, and calibration to accuracy. To address this, we introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks. DEviS not only enhances the calibration and robustness of baseline segmentation accuracy but also provides high-efficiency uncertainty estimation for reliable predictions. By leveraging subjective logic theory, we explicitly model probability and uncertainty for medical image segmentation. Here, the Dirichlet distribution parameterizes the distribution of probabilities for different classes of the segmentation results. To generate calibrated predictions and uncertainty, we develop a trainable calibrated uncertainty penalty. Furthermore, DEviS incorporates an uncertainty-aware filtering module, which designs the metric of uncertainty-calibrated error to filter out-of-distribution data. We conducted validation studies on publicly available datasets, including ISIC2018, KiTS2021, LiTS2017, and BraTS2019, to assess the accuracy and robustness of different backbone segmentation models enhanced by DEviS, as well as the efficiency and reliability of uncertainty estimation.
30.2LGMay 18
Dynamic Elliptical Graph Factor Models via Riemannian Optimization with Geodesic Temporal RegularizationChuansen Peng, Xiaojing Shen
Inferring time-varying graph structures from high-dimensional nodal observations is a fundamental problem arising in neuroscience, finance, climatology, and beyond. Two intrinsic challenges govern this problem: maintaining the \emph{temporal coherence} of the latent graph across successive observation windows, and respecting the \emph{intrinsic Riemannian geometry} of the symmetric positive definite manifold on which precision matrices naturally reside, a curved space whose geodesic structure departs fundamentally from that of the ambient Euclidean space. In this paper we propose dynamic estimation on the Grassmann manifold with a factor model (\textsc{Degfm}), a novel algorithm that jointly addresses both challenges. We model the time-varying precision matrix sequence as a low-rank-plus-diagonal structure governed by a latent elliptical graph factor model, which drastically reduces the effective parameter count and enables reliable estimation in the challenging small-sample regime. Temporal coherence is enforced through a Riemannian geodesic penalty defined on the Grassmann manifold, ensuring that the estimated graph trajectory is smooth with respect to the intrinsic geometry rather than the ambient Euclidean space. To solve the resulting non-convex optimization problem over Grassmann-manifold-valued sequences subject to the LRaD constraint, we derive an efficient Riemannian gradient descent algorithm that respects the manifold structure at every iterate and rigorously establish its convergence to a stationary point. Extensive experiments on both synthetic benchmarks and real-world datasets demonstrate that \textsc{Degfm} consistently outperforms state-of-the-art baselines across all evaluation metrics, confirming the practical effectiveness of the proposed framework.
CVApr 10, 2024
Uncertainty-aware Medical Diagnostic Phrase Identification and GroundingKe Zou, Yang Bai, Bo Liu et al.
Medical phrase grounding is crucial for identifying relevant regions in medical images based on phrase queries, facilitating accurate image analysis and diagnosis. However, current methods rely on manual extraction of key phrases from medical reports, reducing efficiency and increasing the workload for clinicians. Additionally, the lack of model confidence estimation limits clinical trust and usability. In this paper, we introduce a novel task called Medical Report Grounding (MRG), which aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. To address this challenge, we propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases by embedding a unique token, <BOX>, into the vocabulary to enhance detection capabilities. A vision encoder-decoder processes the embedded token and input image to generate grounding boxes. Critically, uMedGround incorporates an uncertainty-aware prediction model, significantly improving the robustness and reliability of grounding predictions. Experimental results demonstrate that uMedGround outperforms state-of-the-art medical phrase grounding methods and fine-tuned large visual-language models, validating its effectiveness and reliability. This study represents a pioneering exploration of the MRG task, marking the first-ever endeavor in this domain. Additionally, we demonstrate the applicability of uMedGround in medical visual question answering and class-based localization tasks, where it highlights visual evidence aligned with key diagnostic phrases, supporting clinicians in interpreting various types of textual inputs, including free-text reports, visual question answering queries, and class labels.
MLOct 19, 2025
Learning Time-Varying Graphs from Incomplete Graph SignalsChuansen Peng, Xiaojing Shen
This paper tackles the challenging problem of jointly inferring time-varying network topologies and imputing missing data from partially observed graph signals. We propose a unified non-convex optimization framework to simultaneously recover a sequence of graph Laplacian matrices while reconstructing the unobserved signal entries. Unlike conventional decoupled methods, our integrated approach facilitates a bidirectional flow of information between the graph and signal domains, yielding superior robustness, particularly in high missing-data regimes. To capture realistic network dynamics, we introduce a fused-lasso type regularizer on the sequence of Laplacians. This penalty promotes temporal smoothness by penalizing large successive changes, thereby preventing spurious variations induced by noise while still permitting gradual topological evolution. For solving the joint optimization problem, we develop an efficient Alternating Direction Method of Multipliers (ADMM) algorithm, which leverages the problem's structure to yield closed-form solutions for both the graph and signal subproblems. This design ensures scalability to large-scale networks and long time horizons. On the theoretical front, despite the inherent non-convexity, we establish a convergence guarantee, proving that the proposed ADMM scheme converges to a stationary point. Furthermore, we derive non-asymptotic statistical guarantees, providing high-probability error bounds for the graph estimator as a function of sample size, signal smoothness, and the intrinsic temporal variability of the graph. Extensive numerical experiments validate the approach, demonstrating that it significantly outperforms state-of-the-art baselines in both convergence speed and the joint accuracy of graph learning and signal recovery.
OCSep 21, 2025
Joint Cooperative and Non-Cooperative Localization in WSNs with Distributed Scaled Proximal ADMM AlgorithmsQiaojia Zhu, Xiaojing Shen, Haiqi Liu et al.
Cooperative and non-cooperative localization frequently arise together in wireless sensor networks, particularly when sensor positions are uncertain and targets are unable to communicate with the network. While joint processing can eliminate the delay in target estimation found in sequential approaches, it introduces complex variable coupling, posing challenges in both modeling and optimization. This paper presents a joint modeling approach that formulates cooperative and non-cooperative localization as a single optimization problem. To address the resulting coupling, we introduce auxiliary variables that enable structural decoupling and distributed computation. Building on this formulation, we develop the Scaled Proximal Alternating Direction Method of Multipliers for Joint Cooperative and Non-Cooperative Localization (SP-ADMM-JCNL). Leveraging the problem's structured design, we provide theoretical guarantees that the algorithm generates a sequence converging globally to the Karush-Kuhn-Tucker (KKT) point of the reformulated problem and further to a critical point of the original non-convex objective function, with a sublinear rate of O(1/T). Experiments on both synthetic and benchmark datasets demonstrate that SP-ADMM-JCNL achieves accurate and reliable localization performance.
LGNov 27, 2021
Towards Understanding the Impact of Model Size on Differential Private ClassificationYinchen Shen, Zhiguo Wang, Ruoyu Sun et al.
Differential privacy (DP) is an essential technique for privacy-preserving. It was found that a large model trained for privacy preserving performs worse than a smaller model (e.g. ResNet50 performs worse than ResNet18). To better understand this phenomenon, we study high dimensional DP learning from the viewpoint of generalization. Theoretically, we show that for the simple Gaussian model with even small DP noise, if the dimension is large enough, then the classification error can be as bad as the random guessing. Then we propose a feature selection method to reduce the size of the model, based on a new metric which trades off the classification accuracy and privacy preserving. Experiments on real data support our theoretical results and demonstrate the advantage of the proposed method.