Nan Mu

CV
h-index17
6papers
4citations
Novelty49%
AI Score44

6 Papers

4.8IVNov 17, 2022
Parameter-Efficient Transformer with Hybrid Axial-Attention for Medical Image Segmentation

Yiyue Hu, Lei Zhang, Nan Mu et al.

Transformers have achieved remarkable success in medical image analysis owing to their powerful capability to use flexible self-attention mechanism. However, due to lacking intrinsic inductive bias in modeling visual structural information, they generally require a large-scale pre-training schedule, limiting the clinical applications over expensive small-scale medical data. To this end, we propose a parameter-efficient transformer to explore intrinsic inductive bias via position information for medical image segmentation. Specifically, we empirically investigate how different position encoding strategies affect the prediction quality of the region of interest (ROI), and observe that ROIs are sensitive to the position encoding strategies. Motivated by this, we present a novel Hybrid Axial-Attention (HAA), a form of position self-attention that can be equipped with spatial pixel-wise information and relative position information as inductive bias. Moreover, we introduce a gating mechanism to alleviate the burden of training schedule, resulting in efficient feature selection over small-scale datasets. Experiments on the BraTS and Covid19 datasets prove the superiority of our method over the baseline and previous works. Internal workflow visualization with interpretability is conducted to better validate our success.

1.9CLNov 1, 2024Code
PrefRAG: Preference-Driven Multi-Source Retrieval Augmented Generation

Qingfei Zhao, Ruobing Wang, Yukuo Cen et al. · tsinghua

Retrieval-Augmented Generation (RAG) has emerged as a reliable external knowledge augmentation technique to mitigate hallucination issues and parameterized knowledge limitations in Large Language Models (LLMs). Existing adaptive RAG (ARAG) systems excel at in-depth exploration within a single source but struggle to effectively and controllably explore different retrieval sources, as they fail to foresee their internal knowledge features. We develop a novel multi-source ARAG system, PrefRAG, which enhances RAG by enabling in-depth and controllable exploration of diverse retrieval sources through preference-driven adaptive retrieval and self-reflection. PrefRAG first fully explores controllable local sources in adaptive retrieval and supplements with the web when appropriate, ultimately selecting the optimal source for knowledge observation. Subsequently, PrefRAG feeds answer quality feedback into the retrieval process, optimizing it from the generation perspective to produce higher-quality responses. Extensive experiments confirm its superiority, high retrieval efficiency, and knowledge controllability. PrefRAG outperforms Vanilla RAG and the leading MS-ARAG by up to 25.6% and 13.9% respectively. Additionally, PrefRAG trained with DPO achieves higher performance. The code and data are available at https://github.com/QingFei1/PrefRAG.git.

3.6CVNov 28, 2025
Convolutional Feature Noise Reduction for 2D Cardiac MR Image Segmentation

Hong Zheng, Nan Mu, Han Su et al.

Noise reduction constitutes a crucial operation within Digital Signal Processing. Regrettably, it frequently remains neglected when dealing with the processing of convolutional features in segmentation networks. This oversight could trigger the butterfly effect, impairing the subsequent outcomes within the entire feature system. To complete this void, we consider convolutional features following Gaussian distributions as feature signal matrices and then present a simple and effective feature filter in this study. The proposed filter is fundamentally a low-amplitude pass filter primarily aimed at minimizing noise in feature signal inputs and is named Convolutional Feature Filter (CFF). We conducted experiments on two established 2D segmentation networks and two public cardiac MR image datasets to validate the effectiveness of the CFF, and the experimental findings demonstrated a decrease in noise within the feature signal matrices. To enable a numerical observation and analysis of this reduction, we developed a binarization equation to calculate the information entropy of feature signals.

3.6CVSep 10, 2025
SFD-Mamba2Net: Structure-Guided Frequency-Enhanced Dual-Stream Mamba2 Network for Coronary Artery Segmentation

Nan Mu, Ruiqi Song, Zhihui Xu et al.

Background: Coronary Artery Disease (CAD) is one of the leading causes of death worldwide. Invasive Coronary Angiography (ICA), regarded as the gold standard for CAD diagnosis, necessitates precise vessel segmentation and stenosis detection. However, ICA images are typically characterized by low contrast, high noise levels, and complex, fine-grained vascular structures, which pose significant challenges to the clinical adoption of existing segmentation and detection methods. Objective: This study aims to improve the accuracy of coronary artery segmentation and stenosis detection in ICA images by integrating multi-scale structural priors, state-space-based long-range dependency modeling, and frequency-domain detail enhancement strategies. Methods: We propose SFD-Mamba2Net, an end-to-end framework tailored for ICA-based vascular segmentation and stenosis detection. In the encoder, a Curvature-Aware Structural Enhancement (CASE) module is embedded to leverage multi-scale responses for highlighting slender tubular vascular structures, suppressing background interference, and directing attention toward vascular regions. In the decoder, we introduce a Progressive High-Frequency Perception (PHFP) module that employs multi-level wavelet decomposition to progressively refine high-frequency details while integrating low-frequency global structures. Results and Conclusions: SFD-Mamba2Net consistently outperformed state-of-the-art methods across eight segmentation metrics, and achieved the highest true positive rate and positive predictive value in stenosis detection.

4.1LGJun 20, 2025
An Uncertainty-Aware Dynamic Decision Framework for Progressive Multi-Omics Integration in Classification Tasks

Nan Mu, Hongbo Yang, Chen Zhao

Background and Objective: High-throughput multi-omics technologies have proven invaluable for elucidating disease mechanisms and enabling early diagnosis. However, the high cost of multi-omics profiling imposes a significant economic burden, with over reliance on full omics data potentially leading to unnecessary resource consumption. To address these issues, we propose an uncertainty-aware, multi-view dynamic decision framework for omics data classification that aims to achieve high diagnostic accuracy while minimizing testing costs. Methodology: At the single-omics level, we refine the activation functions of neural networks to generate Dirichlet distribution parameters, utilizing subjective logic to quantify both the belief masses and uncertainty mass of classification results. Belief mass reflects the support of a specific omics modality for a disease class, while the uncertainty parameter captures limitations in data quality and model discriminability, providing a more trustworthy basis for decision-making. At the multi omics level, we employ a fusion strategy based on Dempster-Shafer theory to integrate heterogeneous modalities, leveraging their complementarity to boost diagnostic accuracy and robustness. A dynamic decision mechanism is then applied that omics data are incrementally introduced for each patient until either all data sources are utilized or the model confidence exceeds a predefined threshold, potentially before all data sources are utilized. Results and Conclusion: We evaluate our approach on four benchmark multi-omics datasets, ROSMAP, LGG, BRCA, and KIPAN. In three datasets, over 50% of cases achieved accurate classification using a single omics modality, effectively reducing redundant testing. Meanwhile, our method maintains diagnostic performance comparable to full-omics models and preserves essential biological insights.

5.1IVJun 13, 2025
FAD-Net: Frequency-Domain Attention-Guided Diffusion Network for Coronary Artery Segmentation using Invasive Coronary Angiography

Nan Mu, Ruiqi Song, Xiaoning Li et al.

Background: Coronary artery disease (CAD) remains one of the leading causes of mortality worldwide. Precise segmentation of coronary arteries from invasive coronary angiography (ICA) is critical for effective clinical decision-making. Objective: This study aims to propose a novel deep learning model based on frequency-domain analysis to enhance the accuracy of coronary artery segmentation and stenosis detection in ICA, thereby offering robust support for the stenosis detection and treatment of CAD. Methods: We propose the Frequency-Domain Attention-Guided Diffusion Network (FAD-Net), which integrates a frequency-domain-based attention mechanism and a cascading diffusion strategy to fully exploit frequency-domain information for improved segmentation accuracy. Specifically, FAD-Net employs a Multi-Level Self-Attention (MLSA) mechanism in the frequency domain, computing the similarity between queries and keys across high- and low-frequency components in ICAs. Furthermore, a Low-Frequency Diffusion Module (LFDM) is incorporated to decompose ICAs into low- and high-frequency components via multi-level wavelet transformation. Subsequently, it refines fine-grained arterial branches and edges by reintegrating high-frequency details via inverse fusion, enabling continuous enhancement of anatomical precision. Results and Conclusions: Extensive experiments demonstrate that FAD-Net achieves a mean Dice coefficient of 0.8717 in coronary artery segmentation, outperforming existing state-of-the-art methods. In addition, it attains a true positive rate of 0.6140 and a positive predictive value of 0.6398 in stenosis detection, underscoring its clinical applicability. These findings suggest that FAD-Net holds significant potential to assist in the accurate diagnosis and treatment planning of CAD.