Yong Fan

h-index17

39papers

3,780citations

Novelty46%

AI Score34

Ranked #130,322 of 201,326 authors (top 65%)#41,107 in CV (top 70%)

39 Papers

IVJul 6, 2022Code

Learning Apparent Diffusion Coefficient Maps from Accelerated Radial k-Space Diffusion-Weighted MRI in Mice using a Deep CNN-Transformer Model

Yuemeng Li, Miguel Romanello Joaquim, Stephen Pickup et al.

Purpose: To accelerate radially sampled diffusion weighted spin-echo (Rad-DW-SE) acquisition method for generating high quality apparent diffusion coefficient (ADC) maps. Methods: A deep learning method was developed to generate accurate ADC maps from accelerated DWI data acquired with the Rad-DW-SE method. The deep learning method integrates convolutional neural networks (CNNs) with vision transformers to generate high quality ADC maps from accelerated DWI data, regularized by a monoexponential ADC model fitting term. A model was trained on DWI data of 147 mice and evaluated on DWI data of 36 mice, with acceleration factors of 4x and 8x compared to the original acquisition parameters. We have made our code publicly available at GitHub: https://github.com/ymli39/DeepADC-Net-Learning-Apparent-Diffusion-Coefficient-Maps, and our dataset can be downloaded at https://pennpancreaticcancerimagingresource.github.io/data.html. Results: Ablation studies and experimental results have demonstrated that the proposed deep learning model generates higher quality ADC maps from accelerated DWI data than alternative deep learning methods under comparison when their performance is quantified in whole images as well as in regions of interest, including tumors, kidneys, and muscles. Conclusions: The deep learning method with integrated CNNs and transformers provides an effective means to accurately compute ADC maps from accelerated DWI data acquired with the Rad-DW-SE method.

CVNov 17, 2023

Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation

Xiaoyang Chen, Hao Zheng, Yuemeng Li et al.

A versatile medical image segmentation model applicable to images acquired with diverse equipment and protocols can facilitate model deployment and maintenance. However, building such a model typically demands a large, diverse, and fully annotated dataset, which is challenging to obtain due to the labor-intensive nature of data curation. To address this challenge, we propose a cost-effective alternative that harnesses multi-source data with only partial or sparse segmentation labels for training, substantially reducing the cost of developing a versatile model. We devise strategies for model self-disambiguation, prior knowledge incorporation, and imbalance mitigation to tackle challenges associated with inconsistently labeled multi-source data, including label ambiguity and modality, dataset, and class imbalances. Experimental results on a multi-modal dataset compiled from eight different sources for abdominal structure segmentation have demonstrated the effectiveness and superior performance of our method compared to state-of-the-art alternative approaches. We anticipate that its cost-saving features, which optimize the utilization of existing annotated data and reduce annotation efforts for new data, will have a significant impact in the field.

IVMar 6, 2023

SurfNN: Joint Reconstruction of Multiple Cortical Surfaces from Magnetic Resonance Images

Hao Zheng, Hongming Li, Yong Fan

To achieve fast, robust, and accurate reconstruction of the human cortical surfaces from 3D magnetic resonance images (MRIs), we develop a novel deep learning-based framework, referred to as SurfNN, to reconstruct simultaneously both inner (between white matter and gray matter) and outer (pial) surfaces from MRIs. Different from existing deep learning-based cortical surface reconstruction methods that either reconstruct the cortical surfaces separately or neglect the interdependence between the inner and outer surfaces, SurfNN reconstructs both the inner and outer cortical surfaces jointly by training a single network to predict a midthickness surface that lies at the center of the inner and outer cortical surfaces. The input of SurfNN consists of a 3D MRI and an initialization of the midthickness surface that is represented both implicitly as a 3D distance map and explicitly as a triangular mesh with spherical topology, and its output includes both the inner and outer cortical surfaces, as well as the midthickness surface. The method has been evaluated on a large-scale MRI dataset and demonstrated competitive cortical surface reconstruction performance.

LGJan 27, 2023

Deep Clustering Survival Machines with Interpretable Expert Distributions

Bojian Hou, Hongming Li, Zhicheng Jiao et al.

Conventional survival analysis methods are typically ineffective to characterize heterogeneity in the population while such information can be used to assist predictive modeling. In this study, we propose a hybrid survival analysis method, referred to as deep clustering survival machines, that combines the discriminative and generative mechanisms. Similar to the mixture models, we assume that the timing information of survival data is generatively described by a mixture of certain numbers of parametric distributions, i.e., expert distributions. We learn weights of the expert distributions for individual instances according to their features discriminatively such that each instance's survival information can be characterized by a weighted combination of the learned constant expert distributions. This method also facilitates interpretable subgrouping/clustering of all instances according to their associated expert distributions. Extensive experiments on both real and synthetic datasets have demonstrated that the method is capable of obtaining promising clustering results and competitive time-to-event predicting performance.

IVAug 23, 2023

HNAS-reg: hierarchical neural architecture search for deformable medical image registration

Jiong Wu, Yong Fan

Convolutional neural networks (CNNs) have been widely used to build deep learning models for medical image registration, but manually designed network architectures are not necessarily optimal. This paper presents a hierarchical NAS framework (HNAS-Reg), consisting of both convolutional operation search and network topology search, to identify the optimal network architecture for deformable medical image registration. To mitigate the computational overhead and memory constraints, a partial channel strategy is utilized without losing optimization quality. Experiments on three datasets, consisting of 636 T1-weighted magnetic resonance images (MRIs), have demonstrated that the proposal method can build a deep learning model with improved image registration accuracy and reduced model size, compared with state-of-the-art image registration approaches, including one representative traditional approach and two unsupervised learning-based approaches.

IVNov 3, 2023

Medical Image Segmentation with Domain Adaptation: A Survey

Yuemeng Li, Yong Fan

Deep learning (DL) has shown remarkable success in various medical imaging data analysis applications. However, it remains challenging for DL models to achieve good generalization, especially when the training and testing datasets are collected at sites with different scanners, due to domain shift caused by differences in data distributions. Domain adaptation has emerged as an effective means to address this challenge by mitigating domain gaps in medical imaging applications. In this review, we specifically focus on domain adaptation approaches for DL-based medical image segmentation. We first present the motivation and background knowledge underlying domain adaptations, then provide a comprehensive review of domain adaptation applications in medical image segmentations, and finally discuss the challenges, limitations, and future research trends in the field to promote the methodology development of domain adaptation in the context of medical image segmentation. Our goal was to provide researchers with up-to-date references on the applications of domain adaptation in medical image segmentation studies.

CVFeb 18, 2025Code

GVTNet: Graph Vision Transformer For Face Super-Resolution

Chao Yang, Yong Fan, Cheng Lu et al.

Recent advances in face super-resolution research have utilized the Transformer architecture. This method processes the input image into a series of small patches. However, because of the strong correlation between different facial components in facial images. When it comes to super-resolution of low-resolution images, existing algorithms cannot handle the relationships between patches well, resulting in distorted facial components in the super-resolution results. To solve the problem, we propose a transformer architecture based on graph neural networks called graph vision transformer network. We treat each patch as a graph node and establish an adjacency matrix based on the information between patches. In this way, the patch only interacts between neighboring patches, further processing the relationship of facial components. Quantitative and visualization experiments have underscored the superiority of our algorithm over state-of-the-art techniques. Through detailed comparisons, we have demonstrated that our algorithm possesses more advanced super-resolution capabilities, particularly in enhancing facial components. The PyTorch code is available at https://github.com/continueyang/GVTNet

CVFeb 18, 2025Code

DeltaDiff: Reality-Driven Diffusion with AnchorResiduals for Faithful SR

Chao Yang, Yong Fan, Qichao Zhang et al.

Recently, the transfer application of diffusion models in super-resolu-tion tasks has faced the problem ofdecreased fidelity. Due to the inherent randomsampling characteristics ofdiffusion models, direct application in super-resolu-tion tasks can result in generated details deviating from the true distribution ofhigh-resolution images. To address this, we propose DeltaDiff, a novel frame.work that constrains the difusion process, its essence is to establish a determin-istic mapping path between HR and LR, rather than the random noise disturbanceprocess oftraditional difusion models. Theoretical analysis demonstrates a 25%reduction in diffusion entropy in the residual space compared to pixel-space diffiusion, effectively suppressing irrelevant noise interference. The experimentalresults show that our method surpasses state-of-the-art models and generates re-sults with better fidelity. This work establishes a new low-rank constrained par-adigm for applying diffusion models to image reconstruction tasks, balancingstochastic generation with structural fidelity. Our code and model are publiclyavailable at https://github.com/continueyang/DeltaDiff .

CVApr 11, 2024

CopilotCAD: Empowering Radiologists with Report Completion Models and Quantitative Evidence from Medical Image Foundation Models

Sheng Wang, Tianming Du, Katherine Fischer et al.

Computer-aided diagnosis systems hold great promise to aid radiologists and clinicians in radiological clinical practice and enhance diagnostic accuracy and efficiency. However, the conventional systems primarily focus on delivering diagnostic results through text report generation or medical image classification, positioning them as standalone decision-makers rather than helpers and ignoring radiologists' expertise. This study introduces an innovative paradigm to create an assistive co-pilot system for empowering radiologists by leveraging Large Language Models (LLMs) and medical image analysis tools. Specifically, we develop a collaborative framework to integrate LLMs and quantitative medical image analysis results generated by foundation models with radiologists in the loop, achieving efficient and safe generation of radiology reports and effective utilization of computational power of AI and the expertise of medical professionals. This approach empowers radiologists to generate more precise and detailed diagnostic reports, enhancing patient outcomes while reducing the burnout of clinicians. Our methodology underscores the potential of AI as a supportive tool in medical diagnostics, promoting a harmonious integration of technology and human expertise to advance the field of radiology.

NCOct 20, 2021

Multidimensional representations in late-life depression: convergence in neuroimaging, cognition, clinical symptomatology and genetics

Junhao Wen, Cynthia H. Y. Fu, Duygu Tosun et al.

Late-life depression (LLD) is characterized by considerable heterogeneity in clinical manifestation. Unraveling such heterogeneity would aid in elucidating etiological mechanisms and pave the road to precision and individualized medicine. We sought to delineate, cross-sectionally and longitudinally, disease-related heterogeneity in LLD linked to neuroanatomy, cognitive functioning, clinical symptomatology, and genetic profiles. Multimodal data from a multicentre sample (N=996) were analyzed. A semi-supervised clustering method (HYDRA) was applied to regional grey matter (GM) brain volumes to derive dimensional representations. Two dimensions were identified, which accounted for the LLD-related heterogeneity in voxel-wise GM maps, white matter (WM) fractional anisotropy (FA), neurocognitive functioning, clinical phenotype, and genetics. Dimension one (Dim1) demonstrated relatively preserved brain anatomy without WM disruptions relative to healthy controls. In contrast, dimension two (Dim2) showed widespread brain atrophy and WM integrity disruptions, along with cognitive impairment and higher depression severity. Moreover, one de novo independent genetic variant (rs13120336) was significantly associated with Dim 1 but not with Dim 2. Notably, the two dimensions demonstrated significant SNP-based heritability of 18-27% within the general population (N=12,518 in UKBB). Lastly, in a subset of individuals having longitudinal measurements, Dim2 demonstrated a more rapid longitudinal decrease in GM and brain age, and was more likely to progress to Alzheimers disease, compared to Dim1 (N=1,413 participants and 7,225 scans from ADNI, BLSA, and BIOCARD datasets).

LGSep 8, 2021

Disentangling Alzheimer's disease neurodegeneration from typical brain aging using machine learning

Gyujoon Hwang, Ahmed Abdulkadir, Guray Erus et al.

Neuroimaging biomarkers that distinguish between typical brain aging and Alzheimer's disease (AD) are valuable for determining how much each contributes to cognitive decline. Machine learning models can derive multi-variate brain change patterns related to the two processes, including the SPARE-AD (Spatial Patterns of Atrophy for Recognition of Alzheimer's Disease) and SPARE-BA (of Brain Aging) investigated herein. However, substantial overlap between brain regions affected in the two processes confounds measuring them independently. We present a methodology toward disentangling the two. T1-weighted MRI images of 4,054 participants (48-95 years) with AD, mild cognitive impairment (MCI), or cognitively normal (CN) diagnoses from the iSTAGING (Imaging-based coordinate SysTem for AGIng and NeurodeGenerative diseases) consortium were analyzed. First, a subset of AD patients and CN adults were selected based purely on clinical diagnoses to train SPARE-BA1 (regression of age using CN individuals) and SPARE-AD1 (classification of CN versus AD). Second, analogous groups were selected based on clinical and molecular markers to train SPARE-BA2 and SPARE-AD2: amyloid-positive (A+) AD continuum group (consisting of A+AD, A+MCI, and A+ and tau-positive CN individuals) and amyloid-negative (A-) CN group. Finally, the combined group of the AD continuum and A-/CN individuals was used to train SPARE-BA3, with the intention to estimate brain age regardless of AD-related brain changes. Disentangled SPARE models derived brain patterns that were more specific to the two types of the brain changes. Correlation between the SPARE-BA and SPARE-AD was significantly reduced. Correlation of disentangled SPARE-AD was non-inferior to the molecular measurements and to the number of APOE4 alleles, but was less to AD-related psychometric test scores, suggesting contribution of advanced brain aging to these scores.

LGFeb 26, 2021

GaNDLF: A Generally Nuanced Deep Learning Framework for Scalable End-to-End Clinical Workflows in Medical Imaging

Sarthak Pati, Siddhesh P. Thakur, İbrahim Ethem Hamamcı et al.

Deep Learning (DL) has the potential to optimize machine learning in both the scientific and clinical communities. However, greater expertise is required to develop DL algorithms, and the variability of implementations hinders their reproducibility, translation, and deployment. Here we present the community-driven Generally Nuanced Deep Learning Framework (GaNDLF), with the goal of lowering these barriers. GaNDLF makes the mechanism of DL development, training, and inference more stable, reproducible, interpretable, and scalable, without requiring an extensive technical background. GaNDLF aims to provide an end-to-end solution for all DL-related tasks in computational precision medicine. We demonstrate the ability of GaNDLF to analyze both radiology and histology images, with built-in support for k-fold cross-validation, data augmentation, multiple modalities and output classes. Our quantitative performance evaluation on numerous use cases, anatomies, and computational tasks supports GaNDLF as a robust application framework for deployment in clinical workflows.

CVDec 11, 2020

Unsupervised deep learning for individualized brain functional network identification

Hongming Li, Yong Fan

A novel unsupervised deep learning method is developed to identify individual-specific large scale brain functional networks (FNs) from resting-state fMRI (rsfMRI) in an end-to-end learning fashion. Our method leverages deep Encoder-Decoder networks and conventional brain decomposition models to identify individual-specific FNs in an unsupervised learning framework and facilitate fast inference for new individuals with one forward pass of the deep network. Particularly, convolutional neural networks (CNNs) with an Encoder-Decoder architecture are adopted to identify individual-specific FNs from rsfMRI data by optimizing their data fitting and sparsity regularization terms that are commonly used in brain decomposition models. Moreover, a time-invariant representation learning module is designed to learn features invariant to temporal orders of time points of rsfMRI data. The proposed method has been validated based on a large rsfMRI dataset and experimental results have demonstrated that our method could obtain individual-specific FNs which are consistent with well-established FNs and are informative for predicting brain age, indicating that the individual-specific FNs identified truly captured the underlying variability of individualized functional neuroanatomy.

CLNov 3, 2020

Meta-Learning for Natural Language Understanding under Continual Learning Framework

Jiacheng Wang, Yong Fan, Duo Jiang et al.

Neural network has been recognized with its accomplishments on tackling various natural language understanding (NLU) tasks. Methods have been developed to train a robust model to handle multiple tasks to gain a general representation of text. In this paper, we implement the model-agnostic meta-learning (MAML) and Online aware Meta-learning (OML) meta-objective under the continual framework for NLU tasks. We validate our methods on selected SuperGLUE and GLUE benchmark.

IVOct 11, 2020

Medical Image Harmonization Using Deep Learning Based Canonical Mapping: Toward Robust and Generalizable Learning in Imaging

Vishnu M. Bashyam, Jimit Doshi, Guray Erus et al.

Conventional and deep learning-based methods have shown great potential in the medical imaging domain, as means for deriving diagnostic, prognostic, and predictive biomarkers, and by contributing to precision medicine. However, these methods have yet to see widespread clinical adoption, in part due to limited generalization performance across various imaging devices, acquisition protocols, and patient populations. In this work, we propose a new paradigm in which data from a diverse range of acquisition conditions are "harmonized" to a common reference domain, where accurate model learning and prediction can take place. By learning an unsupervised image to image canonical mapping from diverse datasets to a reference domain using generative deep learning models, we aim to reduce confounding data variation while preserving semantic information, thereby rendering the learning task easier in the reference domain. We test this approach on two example problems, namely MRI-based brain age prediction and classification of schizophrenia, leveraging pooled cohorts of neuroimaging MRI data spanning 9 sites and 9701 subjects. Our results indicate a substantial improvement in these tasks in out-of-sample data, even when training is restricted to a single site.

CVOct 4, 2020

MDReg-Net: Multi-resolution diffeomorphic image registration using fully convolutional networks with deep self-supervision

Hongming Li, Yong Fan

We present a diffeomorphic image registration algorithm to learn spatial transformations between pairs of images to be registered using fully convolutional networks (FCNs) under a self-supervised learning setting. The network is trained to estimate diffeomorphic spatial transformations between pairs of images by maximizing an image-wise similarity metric between fixed and warped moving images, similar to conventional image registration algorithms. It is implemented in a multi-resolution image registration framework to optimize and learn spatial transformations at different image resolutions jointly and incrementally with deep self-supervision in order to better handle large deformation between images. A spatial Gaussian smoothing kernel is integrated with the FCNs to yield sufficiently smooth deformation fields to achieve diffeomorphic image registration. Particularly, spatial transformations learned at coarser resolutions are utilized to warp the moving image, which is subsequently used for learning incremental transformations at finer resolutions. This procedure proceeds recursively to the full image resolution and the accumulated transformations serve as the final transformation to warp the moving image at the finest resolution. Experimental results for registering high resolution 3D structural brain magnetic resonance (MR) images have demonstrated that image registration networks trained by our method obtain robust, diffeomorphic image registration results within seconds with improved accuracy compared with state-of-the-art image registration algorithms.

IVJun 2, 2020

Adaptive convolutional neural networks for k-space data interpolation in fast magnetic resonance imaging

Tianming Du, Honggang Zhang, Yuemeng Li et al.

Deep learning in k-space has demonstrated great potential for image reconstruction from undersampled k-space data in fast magnetic resonance imaging (MRI). However, existing deep learning-based image reconstruction methods typically apply weight-sharing convolutional neural networks (CNNs) to k-space data without taking into consideration the k-space data's spatial frequency properties, leading to ineffective learning of the image reconstruction models. Moreover, complementary information of spatially adjacent slices is often ignored in existing deep learning methods. To overcome such limitations, we develop a deep learning algorithm, referred to as adaptive convolutional neural networks for k-space data interpolation (ACNN-k-Space), which adopts a residual Encoder-Decoder network architecture to interpolate the undersampled k-space data by integrating spatially contiguous slices as multi-channel input, along with k-space data from multiple coils if available. The network is enhanced by self-attention layers to adaptively focus on k-space data at different spatial frequencies and channels. We have evaluated our method on two public datasets and compared it with state-of-the-art existing methods. Ablation studies and experimental results demonstrate that our method effectively reconstructs images from undersampled k-space data and achieves significantly better image reconstruction performance than current state-of-the-art techniques.

IVFeb 13, 2020

ACEnet: Anatomical Context-Encoding Network for Neuroanatomy Segmentation

Yuemeng Li, Hongming Li, Yong Fan

Segmentation of brain structures from magnetic resonance (MR) scans plays an important role in the quantification of brain morphology. Since 3D deep learning models suffer from high computational cost, 2D deep learning methods are favored for their computational efficiency. However, existing 2D deep learning methods are not equipped to effectively capture 3D spatial contextual information that is needed to achieve accurate brain structure segmentation. In order to overcome this limitation, we develop an Anatomical Context-Encoding Network (ACEnet) to incorporate 3D spatial and anatomical contexts in 2D convolutional neural networks (CNNs) for efficient and accurate segmentation of brain structures from MR scans, consisting of 1) an anatomical context encoding module to incorporate anatomical information in 2D CNNs and 2) a spatial context encoding module to integrate 3D image information in 2D CNNs. In addition, a skull stripping module is adopted to guide the 2D CNNs to attend to the brain. Extensive experiments on three benchmark datasets have demonstrated that our method achieves promising performance compared with state-of-the-art alternative methods for brain structure segmentation in terms of both computational efficiency and segmentation accuracy.

IVOct 23, 2019

Context-endcoding for neural network based skull stripping in magnetic resonance imaging

Zhen Liu, Borui Xiao, Yuemeng Li et al.

Skull stripping is usually the first step for most brain analysisprocess in magnetic resonance images. A lot of deep learn-ing neural network based methods have been developed toachieve higher accuracy. Since the 3D deep learning modelssuffer from high computational cost and are subject to GPUmemory limit challenge, a variety of 2D deep learning meth-ods have been developed. However, existing 2D deep learn-ing methods are not equipped to effectively capture 3D se-mantic information that is needed to achieve higher accuracy.In this paper, we propose a context-encoding method to em-power the 2D network to capture the 3D context information.For the context-encoding method, firstly we encode the 2Dfeatures of original 2D network, secondly we encode the sub-volume of 3D MRI images, finally we fuse the encoded 2Dfeatures and 3D features with semantic encoding classifica-tion loss. To get computational efficiency, although we en-code the sub-volume of 3D MRI images instead of buildinga 3D neural network, extensive experiments on three bench-mark Datasets demonstrate our method can achieve superioraccuracy to state-of-the-art alternative methods with the dicescore 99.6% on NFBS and 99.09 % on LPBA40 and 99.17 %on OASIS.

CVMay 7, 2019

Feature-Fused Context-Encoding Network for Neuroanatomy Segmentation

Yuemeng Li, Hangfan Liu, Hongming Li et al.

Automatic segmentation of fine-grained brain structures remains a challenging task. Current segmentation methods mainly utilize 2D and 3D deep neural networks. The 2D networks take image slices as input to produce coarse segmentation in less processing time, whereas the 3D networks take the whole image volumes to generated fine-detailed segmentation with more computational burden. In order to obtain accurate fine-grained segmentation efficiently, in this paper, we propose an end-to-end Feature-Fused Context-Encoding Network for brain structure segmentation from MR (magnetic resonance) images. Our model is implemented based on a 2D convolutional backbone, which integrates a 2D encoding module to acquire planar image features and a spatial encoding module to extract spatial context information. A global context encoding module is further introduced to capture global context semantics from the fused 2D encoding and spatial features. The proposed network aims to fully leverage the global anatomical prior knowledge learned from context semantics, which is represented by a structure-aware attention factor to recalibrate the outputs of the network. In this way, the network is guaranteed to be aware of the class-dependent feature maps to facilitate the segmentation. We evaluate our model on 2012 Brain Multi-Atlas Labelling Challenge dataset for 134 fine-grained structure segmentation. Besides, we validate our network on 27 coarse structure segmentation tasks. Experimental results have demonstrated that our model can achieve improved performance compared with the state-of-the-art approaches.

CVApr 15, 2019

A deep learning model for early prediction of Alzheimer's disease dementia based on hippocampal MRI

Hongming Li, Mohamad Habes, David A. Wolk et al.

Introduction: It is challenging at baseline to predict when and which individuals who meet criteria for mild cognitive impairment (MCI) will ultimately progress to Alzheimer's disease (AD) dementia. Methods: A deep learning method is developed and validated based on MRI scans of 2146 subjects (803 for training and 1343 for validation) to predict MCI subjects' progression to AD dementia in a time-to-event analysis setting. Results: The deep learning time-to-event model predicted individual subjects' progression to AD dementia with a concordance index (C-index) of 0.762 on 439 ADNI testing MCI subjects with follow-up duration from 6 to 78 months (quartiles: [24, 42, 54]) and a C-index of 0.781 on 40 AIBL testing MCI subjects with follow-up duration from 18-54 months (quartiles: [18, 36,54]). The predicted progression risk also clustered individual subjects into subgroups with significant differences in their progression time to AD dementia (p<0.0002). Improved performance for predicting progression to AD dementia (C-index=0.864) was obtained when the deep learning based progression risk was combined with baseline clinical measures. Conclusion: Our method provides a cost effective and accurate means for prognosis and potentially to facilitate enrollment in clinical trials with individuals likely to progress within a specific temporal period.

CVApr 6, 2019

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection

Yuemeng Li, Yong Fan

Pulmonary nodule detection plays an important role in lung cancer screening with low-dose computed tomography (CT) scans. It remains challenging to build nodule detection deep learning models with good generalization performance due to unbalanced positive and negative samples. In order to overcome this problem and further improve state-of-the-art nodule detection methods, we develop a novel deep 3D convolutional neural network with an Encoder-Decoder structure in conjunction with a region proposal network. Particularly, we utilize a dynamically scaled cross entropy loss to reduce the false positive rate and combat the sample imbalance problem associated with nodule detection. We adopt the squeeze-and-excitation structure to learn effective image features and utilize inter-dependency information of different feature maps. We have validated our method based on publicly available CT scans with manually labelled ground-truth obtained from LIDC/IDRI dataset and its subset LUNA16 with thinner slices. Ablation studies and experimental results have demonstrated that our method could outperform state-of-the-art nodule detection methods by a large margin.

CVJan 5, 2019

Fully-automatic segmentation of kidneys in clinical ultrasound images using a boundary distance regression network

Shi Yin, Zhengqiang Zhang, Hongming Li et al.

It remains challenging to automatically segment kidneys in clinical ultrasound images due to the kidneys' varied shapes and image intensity distributions, although semi-automatic methods have achieved promising performance. In this study, we developed a novel boundary distance regression deep neural network to segment the kidneys, informed by the fact that the kidney boundaries are relatively consistent across images in terms of their appearance. Particularly, we first use deep neural networks pre-trained for classification of natural images to extract high-level image features from ultrasound images, then these feature maps are used as input to learn kidney boundary distance maps using a boundary distance regression network, and finally the predicted boundary distance maps are classified as kidney pixels or non-kidney pixels using a pixel classification network in an end-to-end learning fashion. Experimental results have demonstrated that our method could effectively improve the performance of automatic kidney segmentation, significantly better than deep learning based pixel classification networks.

CVJan 5, 2019

Early Prediction of Alzheimer's Disease Dementia Based on Baseline Hippocampal MRI and 1-Year Follow-Up Cognitive Measures Using Deep Recurrent Neural Networks

Hongming Li, Yong Fan

Multi-modal biological, imaging, and neuropsychological markers have demonstrated promising performance for distinguishing Alzheimer's disease (AD) patients from cognitively normal elders. However, it remains difficult to early predict when and which mild cognitive impairment (MCI) individuals will convert to AD dementia. Informed by pattern classification studies which have demonstrated that pattern classifiers built on longitudinal data could achieve better classification performance than those built on cross-sectional data, we develop a deep learning model based on recurrent neural networks (RNNs) to learn informative representation and temporal dynamics of longitudinal cognitive measures of individual subjects and combine them with baseline hippocampal MRI for building a prognostic model of AD dementia progression. Experimental results on a large cohort of MCI subjects have demonstrated that the deep learning model could learn informative measures from longitudinal data for characterizing the progression of MCI subjects to AD dementia, and the prognostic model could early predict AD progression with high accuracy.

CVJan 5, 2019

Deep Convolutional Neural Networks for Imaging Data Based Survival Analysis of Rectal Cancer

Hongming Li, Pamela Boimel, James Janopaul-Naylor et al.

Recent radiomic studies have witnessed promising performance of deep learning techniques in learning radiomic features and fusing multimodal imaging data. Most existing deep learning based radiomic studies build predictive models in a setting of pattern classification, not appropriate for survival analysis studies where some data samples have incomplete observations. To improve existing survival analysis techniques whose performance is hinged on imaging features, we propose a deep learning method to build survival regression models by optimizing imaging features with deep convolutional neural networks (CNNs) in a proportional hazards model. To make the CNNs applicable to tumors with varied sizes, a spatial pyramid pooling strategy is adopted. Our method has been validated based on a simulated imaging dataset and a FDG-PET/CT dataset of rectal cancer patients treated for locally advanced rectal cancer. Compared with survival prediction models built upon hand-crafted radiomic features using Cox proportional hazards model and random survival forests, our method achieved competitive prediction performance.

CVNov 12, 2018

Automatic kidney segmentation in ultrasound images using subsequent boundary distance regression and pixelwise classification networks

Shi Yin, Qinmu Peng, Hongming Li et al.

It remains challenging to automatically segment kidneys in clinical ultrasound (US) images due to the kidneys' varied shapes and image intensity distributions, although semi-automatic methods have achieved promising performance. In this study, we propose subsequent boundary distance regression and pixel classification networks to segment the kidneys, informed by the fact that the kidney boundaries have relatively homogenous texture patterns across images. Particularly, we first use deep neural networks pre-trained for classification of natural images to extract high-level image features from US images, then these features are used as input to learn kidney boundary distance maps using a boundary distance regression network, and finally the predicted boundary distance maps are classified as kidney pixels or non-kidney pixels using a pixel classification network in an end-to-end learning fashion. We also adopted a data-augmentation method based on kidney shape registration to generate enriched training data from a small number of US images with manually segmented kidney labels. Experimental results have demonstrated that our method could effectively improve the performance of automatic kidney segmentation, significantly better than deep learning-based pixel classification networks.

CVNov 5, 2018

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Spyridon Bakas, Mauricio Reyes, Andras Jakab et al.

Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.

CVSep 14, 2018

Brain decoding from functional MRI using long short-term memory recurrent neural networks

Hongming Li, Yong Fan

Decoding brain functional states underlying different cognitive processes using multivariate pattern recognition techniques has attracted increasing interests in brain imaging studies. Promising performance has been achieved using brain functional connectivity or brain activation signatures for a variety of brain decoding tasks. However, most of existing studies have built decoding models upon features extracted from imaging data at individual time points or temporal windows with a fixed interval, which might not be optimal across different cognitive processes due to varying temporal durations and dependency of different cognitive processes. In this study, we develop a deep learning based framework for brain decoding by leveraging recent advances in sequence modeling using long short-term memory (LSTM) recurrent neural networks (RNNs). Particularly, functional profiles extracted from task functional imaging data based on their corresponding subject-specific intrinsic functional networks are used as features to build brain decoding models, and LSTM RNNs are adopted to learn decoding mappings between functional profiles and brain states. We evaluate the proposed method using task fMRI data from the HCP dataset, and experimental results have demonstrated that the proposed method could effectively distinguish brain states under different task events and obtain higher accuracy than conventional decoding models.

CVSep 14, 2018

Identification of temporal transition of functional states using recurrent neural networks from functional MRI

Hongming Li, Yong Fan

Dynamic functional connectivity analysis provides valuable information for understanding brain functional activity underlying different cognitive processes. Besides sliding window based approaches, a variety of methods have been developed to automatically split the entire functional MRI scan into segments by detecting change points of functional signals to facilitate better characterization of temporally dynamic functional connectivity patterns. However, these methods are based on certain assumptions for the functional signals, such as Gaussian distribution, which are not necessarily suitable for the fMRI data. In this study, we develop a deep learning based framework for adaptively detecting temporally dynamic functional state transitions in a data-driven way without any explicit modeling assumptions, by leveraging recent advances in recurrent neural networks (RNNs) for sequence modeling. Particularly, we solve this problem in an anomaly detection framework with an assumption that the functional profile of one single time point could be reliably predicted based on its preceding profiles within stable functional state, while large prediction errors would occur around change points of functional states. We evaluate the proposed method using both task and resting-state fMRI data obtained from the human connectome project and experimental results have demonstrated that the proposed change point detection method could effectively identify change points between different task events and split the resting-state fMRI into segments with distinct functional connectivity patterns.

CVSep 14, 2018

Identification of multi-scale hierarchical brain functional networks using deep matrix factorization

Hongming Li, Xiaofeng Zhu, Yong Fan

We present a deep semi-nonnegative matrix factorization method for identifying subject-specific functional networks (FNs) at multiple spatial scales with a hierarchical organization from resting state fMRI data. Our method is built upon a deep semi-nonnegative matrix factorization framework to jointly detect the FNs at multiple scales with a hierarchical organization, enhanced by group sparsity regularization that helps identify subject-specific FNs without loss of inter-subject comparability. The proposed method has been validated for predicting subject-specific functional activations based on functional connectivity measures of the hierarchical multi-scale FNs of the same subjects. Experimental results have demonstrated that our method could obtain subject-specific multi-scale hierarchical FNs and their functional connectivity measures across different scales could better predict subject-specific functional activations than those obtained by alternative techniques.

CVJan 11, 2018

Brain Age Prediction Based on Resting-State Functional Connectivity Patterns Using Convolutional Neural Networks

Hongming Li, Theodore D. Satterthwaite, Yong Fan

Brain age prediction based on neuroimaging data could help characterize both the typical brain development and neuropsychiatric disorders. Pattern recognition models built upon functional connectivity (FC) measures derived from resting state fMRI (rsfMRI) data have been successfully used to predict the brain age. However, most existing studies focus on coarse-grained FC measures between brain regions or intrinsic connectivity networks (ICNs), which may sacrifice fine-grained FC information of the rsfMRI data. Whole brain voxel-wise FC measures could provide fine-grained FC information of the brain and may improve the prediction performance. In this study, we develop a deep learning method to use convolutional neural networks (CNNs) to learn informative features from the fine-grained whole brain FC measures for the brain age prediction. Experimental results on a large dataset of resting-state fMRI demonstrate that the deep learning model with fine-grained FC measures could better predict the brain age.

CVJan 11, 2018

Non-Rigid Image Registration Using Self-Supervised Fully Convolutional Networks without Training Data

Hongming Li, Yong Fan

A novel non-rigid image registration algorithm is built upon fully convolutional networks (FCNs) to optimize and learn spatial transformations between pairs of images to be registered in a self-supervised learning framework. Different from most existing deep learning based image registration methods that learn spatial transformations from training data with known corresponding spatial transformations, our method directly estimates spatial transformations between pairs of images by maximizing an image-wise similarity metric between fixed and deformed moving images, similar to conventional image registration algorithms. The image registration is implemented in a multi-resolution image registration framework to jointly optimize and learn spatial transformations and FCNs at different spatial resolutions with deep self-supervision through typical feedforward and backpropagation computation. The proposed method has been evaluated for registering 3D structural brain magnetic resonance (MR) images and obtained better performance than state-of-the-art image registration algorithms.

CVDec 31, 2017

Transfer learning for diagnosis of congenital abnormalities of the kidney and urinary tract in children based on Ultrasound imaging data

Qiang Zheng, Gregory Tasian, Yong Fan

Classification of ultrasound (US) kidney images for diagnosis of congenital abnormalities of the kidney and urinary tract (CAKUT) in children is a challenging task. It is desirable to improve existing pattern classification models that are built upon conventional image features. In this study, we propose a transfer learning-based method to extract imaging features from US kidney images in order to improve the CAKUT diagnosis in children. Particularly, a pre-trained deep learning model (imagenet-caffe-alex) is adopted for transfer learning-based feature extraction from 3-channel feature maps computed from US images, including original images, gradient features, and distanced transform features. Support vector machine classifiers are then built upon different sets of features, including the transfer learning features, conventional imaging features, and their combination. Experimental results have demonstrated that the combination of transfer learning features and conventional imaging features yielded the best classification performance for distinguishing CAKUT patients from normal controls based on their US kidney images.

CVDec 31, 2017

Integrating semi-supervised label propagation and random forests for multi-atlas based hippocampus segmentation

Qiang Zheng, Yong Fan

A novel multi-atlas based image segmentation method is proposed by integrating a semi-supervised label propagation method and a supervised random forests method in a pattern recognition based label fusion framework. The semi-supervised label propagation method takes into consideration local and global image appearance of images to be segmented and segments the images by propagating reliable segmentation results obtained by the supervised random forests method. Particularly, the random forests method is used to train a regression model based on image patches of atlas images for each voxel of the images to be segmented. The regression model is used to obtain reliable segmentation results to guide the label propagation for the segmentation. The proposed method has been compared with state-of-the-art multi-atlas based image segmentation methods for segmenting the hippocampus in MR images. The experiment results have demonstrated that our method obtained superior segmentation performance.

CVSep 5, 2017

Deep Ordinal Ranking for Multi-Category Diagnosis of Alzheimer's Disease using Hippocampal MRI data

Hongming Li, Mohamad Habes, Yong Fan

Increasing effort in brain image analysis has been dedicated to early diagnosis of Alzheimer's disease (AD) based on neuroimaging data. Most existing studies have been focusing on binary classification problems, e.g., distinguishing AD patients from normal control (NC) elderly or mild cognitive impairment (MCI) individuals from NC elderly. However, identifying individuals with AD and MCI, especially MCI individuals who will convert to AD (progressive MCI, pMCI), in a single setting, is needed to achieve the goal of early diagnosis of AD. In this paper, we propose a deep ordinal ranking model for distinguishing NC, stable MCI (sMCI), pMCI, and AD at an individual subject level, taking into account the inherent ordinal severity of brain degeneration caused by normal aging, MCI, and AD, rather than formulating the classification as a multi-category classification problem. The proposed deep ordinal ranking model focuses on the hippocampal morphology of individuals and learns informative and discriminative features automatically. Experiment results based on a large cohort of individuals from the Alzheimer's Disease Neuroimaging Initiative (ADNI) indicate that the proposed method can achieve better performance than traditional multi-category classification techniques using shape and radiomics features from structural magnetic resonance imaging (MRI) data.

CVSep 4, 2017

Non-rigid image registration using fully convolutional networks with deep self-supervision

Hongming Li, Yong Fan

We propose a novel non-rigid image registration algorithm that is built upon fully convolutional networks (FCNs) to optimize and learn spatial transformations between pairs of images to be registered. Different from most existing deep learning based image registration methods that learn spatial transformations from training data with known corresponding spatial transformations, our method directly estimates spatial transformations between pairs of images by maximizing an image-wise similarity metric between fixed and deformed moving images, similar to conventional image registration algorithms. At the same time, our method also learns FCNs for encoding the spatial transformations at the same spatial resolution of images to be registered, rather than learning coarse-grained spatial transformation information. The image registration is implemented in a multi-resolution image registration framework to jointly optimize and learn spatial transformations and FCNs at different resolutions with deep self-supervision through typical feedforward and backpropagation computation. Since our method simultaneously optimizes and learns spatial transformations for the image registration, our method can be directly used to register a pair of images, and the registration of a set of images is also a training procedure for FCNs so that the trained FCNs can be directly adopted to register new images by feedforward computation of the learned FCNs without any optimization. The proposed method has been evaluated for registering 3D structural brain magnetic resonance (MR) images and obtained better performance than state-of-the-art image registration algorithms.

CVJun 11, 2017

A dynamic graph-cuts method with integrated multiple feature maps for segmenting kidneys in ultrasound images

Qiang Zheng, Steven Warner, Gregory Tasian et al.

Purpose: To improve kidney segmentation in clinical ultrasound (US) images, we develop a new graph cuts based method to segment kidney US images by integrating original image intensity information and texture feature maps extracted using Gabor filters. Methods: To handle large appearance variation within kidney images and improve computational efficiency, we build a graph of image pixels close to kidney boundary instead of building a graph of the whole image. To make the kidney segmentation robust to weak boundaries, we adopt localized regional information to measure similarity between image pixels for computing edge weights to build the graph of image pixels. The localized graph is dynamically updated and the GC based segmentation iteratively progresses until convergence. The proposed method has been evaluated and compared with state of the art image segmentation methods based on clinical kidney US images of 85 subjects. We randomly selected US images of 20 subjects as training data for tuning the parameters, and validated the methods based on US images of the remaining 65 subjects. The segmentation results have been quantitatively analyzed using 3 metrics, including Dice Index, Jaccard Index, and Mean Distance. Results: Experiment results demonstrated that the proposed method obtained segmentation results for bilateral kidneys of 65 subjects with average Dice index of 0.9581, Jaccard index of 0.9204, and Mean Distance of 1.7166, better than other methods under comparison (p<10-19, paired Wilcoxon rank sum tests). Conclusions: The proposed method achieved promising performance for segmenting kidneys in US images, better than segmentation methods that built on any single channel of image information. This method will facilitate extraction of kidney characteristics that may predict important clinical outcomes such progression chronic kidney disease.

CVFeb 15, 2017

A deep learning model integrating FCNNs and CRFs for brain tumor segmentation

Xiaomei Zhao, Yihong Wu, Guidong Song et al.

Accurate and reliable brain tumor segmentation is a critical component in cancer diagnosis, treatment planning, and treatment outcome evaluation. Build upon successful deep learning techniques, a novel brain tumor segmentation method is developed by integrating fully convolutional neural networks (FCNNs) and Conditional Random Fields (CRFs) in a unified framework to obtain segmentation results with appearance and spatial consistency. We train a deep learning based segmentation model using 2D image patches and image slices in following steps: 1) training FCNNs using image patches; 2) training CRFs as Recurrent Neural Networks (CRF-RNN) using image slices with parameters of FCNNs fixed; and 3) fine-tuning the FCNNs and the CRF-RNN using image slices. Particularly, we train 3 segmentation models using 2D image patches and slices obtained in axial, coronal and sagittal views respectively, and combine them to segment brain tumors using a voting based fusion strategy. Our method could segment brain images slice-by-slice, much faster than those based on image patches. We have evaluated our method based on imaging data provided by the Multimodal Brain Tumor Image Segmentation Challenge (BRATS) 2013, BRATS 2015 and BRATS 2016. The experimental results have demonstrated that our method could build a segmentation model with Flair, T1c, and T2 scans and achieve competitive performance as those built with Flair, T1, T1c, and T2 scans.

LGApr 2, 2015

Direct l_(2,p)-Norm Learning for Feature Selection

Hanyang Peng, Yong Fan

In this paper, we propose a novel sparse learning based feature selection method that directly optimizes a large margin linear classification model sparsity with l_(2,p)-norm (0 < p < 1)subject to data-fitting constraints, rather than using the sparsity as a regularization term. To solve the direct sparsity optimization problem that is non-smooth and non-convex when 0<p<1, we provide an efficient iterative algorithm with proved convergence by converting it to a convex and smooth optimization problem at every iteration step. The proposed algorithm has been evaluated based on publicly available datasets, and extensive comparison experiments have demonstrated that our algorithm could achieve feature selection performance competitive to state-of-the-art algorithms.