Junfeng Zhang

CV
h-index: 21
Papers: 11
Citations: 600
Novelty: 46%
AI Score: 49

11 Papers

DG · Feb 6, 2019
A Physics-Based Estimation of Mean Curvature Normal Vector for Triangulated Surfaces

Sudip Kumar Das, Mirza Cenanovic, Junfeng Zhang

In this note, we derive an approximation for the mean curvature normal vector on vertices of triangulated surface meshes from the Young-Laplace equation and the force balance principle. We then demonstrate that the approximation expression from our physics-based derivation is equivalent to the discrete Laplace-Beltrami operator approach in the literature. This work, in addition to providing an alternative expression to calculate the mean curvature normal vector, can be further extended to other mesh structures, including non-triangular and heterogeneous meshes.
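The discrete Laplace-Beltrami form that the abstract refers to is commonly written with cotangent weights. The sketch below is one minimal NumPy version of that literature formula, not the authors' physics-based derivation. The function name `mean_curvature_normal`, the choice of barycentric vertex area (one third of each incident triangle), and the sign convention (the result equals twice the mean curvature times the outward normal on a convex surface) are assumptions made for illustration.

```python
import numpy as np

def mean_curvature_normal(vertices, faces, i):
    """Cotangent-weight estimate of the mean curvature normal at interior vertex i.

    Returns K = (1 / (2 A)) * sum_j (cot(alpha_ij) + cot(beta_ij)) * (x_i - x_j),
    where A is the barycentric area around vertex i (an assumed choice).
    """
    x = np.asarray(vertices, dtype=float)

    def cot_at(p, q, r):
        # Cotangent of the angle at vertex p in triangle (p, q, r).
        u, v = x[q] - x[p], x[r] - x[p]
        return np.dot(u, v) / np.linalg.norm(np.cross(u, v))

    total = np.zeros(3)
    area = 0.0
    for face in faces:
        face = list(face)
        if i not in face:
            continue
        k = face.index(i)
        a, b = face[(k + 1) % 3], face[(k + 2) % 3]
        # The angle at a is opposite edge (i, b); the angle at b is opposite edge (i, a).
        total += cot_at(a, i, b) * (x[i] - x[b])
        total += cot_at(b, i, a) * (x[i] - x[a])
        # One third of the triangle area is assigned to vertex i.
        area += 0.5 * np.linalg.norm(np.cross(x[a] - x[i], x[b] - x[i])) / 3.0
    return total / (2.0 * area)

# Triangle fan around the top vertex of a regular octahedron inscribed in the unit sphere.
octa_vertices = [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]
octa_faces = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
K = mean_curvature_normal(octa_vertices, octa_faces, 0)
```

For this octahedron fan the estimate points along the outward normal (0, 0, 1). Other vertex-area definitions, such as the mixed Voronoi area, would change the magnitude but not the direction.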

AI · Aug 21, 2024
One-step Structure Prediction and Screening for Protein-Ligand Complexes using Multi-Task Geometric Deep Learning

Kelei He, Tiejun Dong, Jinhui Wu et al.

Understanding the structure of the protein-ligand complex is crucial to drug development. Existing virtual structure measurement and screening methods are dominated by docking and its derived methods combined with deep learning. However, their sampling and scoring methodologies have largely restricted accuracy and efficiency. Here, we show that these two fundamental tasks can be accurately tackled with a single model, namely LigPose, based on multi-task geometric deep learning. By representing the ligand and the protein pair as a graph, LigPose directly optimizes the three-dimensional structure of the complex, with the learning of binding strength and atomic interactions as auxiliary tasks, enabling one-step prediction without docking tools. Extensive experiments show that LigPose achieves state-of-the-art performance on major tasks in drug research. Its considerable improvements indicate a promising paradigm for AI-based drug development pipelines.

70.4 · CG · Mar 24
Simple but not Simpler: A Surface-Sliding Method for Finding the Minimum Distance between Two Ellipsoids

Dariush Amirkhani, Junfeng Zhang

We propose a novel iterative process to establish the minimum separation between two ellipsoids. The method maintains one point on each surface and updates their locations in the theta-phi parametric space. The tension along the connecting segment between the two surface points serves as the guidance for the sliding direction, and the distance between them decreases gradually. The minimum distance is established when the connecting segment becomes perpendicular to the ellipsoid surfaces, at which point the net effect of the segment tension disappears and the surface points no longer move. Demonstration examples are carefully designed, and excellent numerical performance is observed, including accuracy, consistency, stability, and robustness. Furthermore, compared to other existing techniques, this surface-sliding approach has several attractive features, such as clear geometric representation, concise formulation, a simple algorithm, and the potential to be extended straightforwardly to other situations. This method is expected to be useful for future studies in computer graphics, engineering design, material modeling, and scientific simulations.
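The idea of keeping one point on each surface and moving both in the theta-phi parametric space can be illustrated with a short sketch. The version below is plain fixed-step gradient descent on the squared distance, where the connecting segment is projected onto each surface's parametric tangent directions. It is an assumed stand-in for illustration, not the authors' update rule. The function names, the restriction to axis-aligned ellipsoids, the step size, and the iteration count are all hypothetical choices.

```python
import numpy as np

def surface_point(center, axes, theta, phi):
    """Point on an axis-aligned ellipsoid for polar angle theta and azimuth phi."""
    a, b, c = axes
    return np.asarray(center, dtype=float) + np.array([
        a * np.sin(theta) * np.cos(phi),
        b * np.sin(theta) * np.sin(phi),
        c * np.cos(theta),
    ])

def tangents(axes, theta, phi):
    """Partial derivatives of the surface point with respect to theta and phi."""
    a, b, c = axes
    d_theta = np.array([
        a * np.cos(theta) * np.cos(phi),
        b * np.cos(theta) * np.sin(phi),
        -c * np.sin(theta),
    ])
    d_phi = np.array([
        -a * np.sin(theta) * np.sin(phi),
        b * np.sin(theta) * np.cos(phi),
        0.0,
    ])
    return d_theta, d_phi

def min_distance(c1, ax1, c2, ax2, start1, start2, step=0.1, iters=5000):
    """Slide one point on each ellipsoid until the connecting segment stops pulling."""
    t1, f1 = start1
    t2, f2 = start2
    for _ in range(iters):
        p1 = surface_point(c1, ax1, t1, f1)
        p2 = surface_point(c2, ax2, t2, f2)
        pull = p2 - p1  # connecting segment, from point 1 to point 2
        dt1, df1 = tangents(ax1, t1, f1)
        dt2, df2 = tangents(ax2, t2, f2)
        # Point 1 slides along the pull, point 2 slides against it.
        t1 += step * np.dot(pull, dt1)
        f1 += step * np.dot(pull, df1)
        t2 -= step * np.dot(pull, dt2)
        f2 -= step * np.dot(pull, df2)
    p1 = surface_point(c1, ax1, t1, f1)
    p2 = surface_point(c2, ax2, t2, f2)
    return np.linalg.norm(p2 - p1)

# Two unit spheres whose centers are 3 apart, started from arbitrary surface points.
d = min_distance((0, 0, 0), (1, 1, 1), (3, 0, 0), (1, 1, 1), (1.0, 0.5), (2.0, 2.5))
```

The updates vanish when the segment is perpendicular to both surfaces, which matches the stopping condition described in the abstract. A fixed step is only reasonable for ellipsoids of roughly unit size, and the parametrization degenerates at the poles (theta equal to 0 or pi), so a practical version would need step control and pole handling.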

CV · May 17, 2021
Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation

Kelei He, Wen Ji, Tao Zhou et al.

Accurate segmentation of brain tumors from multi-modal Magnetic Resonance (MR) images is essential in brain tumor diagnosis and treatment. However, due to domain shifts among different modalities, the performance of networks decreases dramatically when they are trained on one modality and tested on another, e.g., trained on T1 images and tested on T2 images, which is often required in clinical applications. This also prohibits a network from being trained on labeled data and then transferred to unlabeled data from a different domain. To overcome this, unsupervised domain adaptation (UDA) methods provide effective solutions to alleviate the domain shift between labeled source data and unlabeled target data. In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme. Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor using the intermediate data distributions generated for the two domains, which includes an image-to-image translator and a shared-weighted segmentation network. Further, a global-to-local consistency learning module is proposed to build robust representation alignments in an integrated way. Extensive experiments on a multi-modal brain MR benchmark dataset demonstrate that the proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin, while a comprehensive ablation study validates the effectiveness of each key component. The implementation code of our method will be released at https://github.com/KeleiHe/BiGL.

28.1 · GR · Apr 27
UVTran: Accurate Hole-Filling Parameterization with Transformers

JunFeng Zhang

In industrial design, N-sided hole filling is typically formulated as the construction of a single trimmed B-spline surface by minimizing a fairness energy subject to geometric boundary constraints. This formulation requires an accurate parameter-space representation of the trimming curve on the filling surface. Most existing methods project the hole boundary onto a nearby plane or polygon to establish correspondence; however, they often neglect boundary heterogeneity, which can yield biased mappings, degrade fairness, and even cause filling failures. We propose UVTran, a transformer-based framework that predicts an auxiliary projection surface to better capture the geometric characteristics of the hole boundary. Exploiting B-spline locality, we design a cross-attention mechanism that biases each surface control point toward the nearby hole boundary, preserving local geometric detail. We voxelize control-point coordinates and formulate the fitting problem as a classification task, which reduces the model's sensitivity to small numerical perturbations and noise. We adopt a progressive-resolution training strategy that injects controlled discretization errors at coarse resolutions to mimic distribution shifts, thereby mitigating overfitting and improving generalization at high resolution. On our benchmark, UVTran outperforms both industrial and academic baselines: the tolerance-satisfaction rate improves by 12%, and it consistently produces fair filled surfaces even under complex hole boundary conditions. These results suggest that UVTran yields more faithful correspondences and fairer trimmed surfaces across a wide range of N-sided holes.

CV · Sep 21, 2025
A Cross-Hierarchical Multi-Feature Fusion Network Based on Multiscale Encoder-Decoder for Hyperspectral Change Detection

Mingshuai Sheng, Bhatti Uzair Aslam, Junfeng Zhang et al.

Hyperspectral change detection (HCD) aims to accurately identify land-cover changes in hyperspectral images of the same area acquired at different times, with key applications in environmental monitoring and disaster assessment. To address limitations of existing methods, such as insufficient use of multiscale features and low efficiency in differential feature fusion, this paper proposes a cross-hierarchical multi-feature fusion network (CHMFFN) based on a multiscale encoder-decoder architecture. The front end adopts a multiscale feature extraction subnetwork, built on an encoder-decoder backbone with residual connections and a dual-core channel-spatial attention (DCCSA) module, to extract spectral-spatial-temporal features (SSTF). The encoder captures multiscale features from shallow details to deep semantics via residual blocks and convolutional kernels with varying receptive fields. The decoder restores spatial resolution and suppresses noise through skip connections that integrate encoder features. Additionally, a spectral-temporal change feature learning (STCFL) module learns cross-temporal change features at different levels, strengthening the capture of inter-temporal differences. An adaptive fusion of advanced features (AFAF) module dynamically balances hierarchical differential features via adaptive weights, enhancing the representation of complex changes. Experiments on four public hyperspectral datasets show that CHMFFN outperforms state-of-the-art methods, verifying its effectiveness.

CV · Feb 24, 2022
Transformers in Medical Image Analysis: A Review

Kelei He, Chen Gan, Zhuoyuan Li et al.

Transformers have dominated the field of natural language processing and have recently made an impact on computer vision. In the field of medical image analysis, Transformers have also been successfully applied to full-stack clinical applications, including image synthesis/reconstruction, registration, segmentation, detection, and diagnosis. Our paper aims to promote awareness and application of Transformers in the field of medical image analysis. Specifically, we first give an overview of the core concepts of the attention mechanism built into Transformers and other basic components. Second, we review various Transformer architectures tailored for medical image applications and discuss their limitations. Within this review, we investigate key challenges revolving around the use of Transformers in different learning paradigms, improving model efficiency, and coupling Transformers with other techniques. We hope this review can give readers in the field of medical image analysis a comprehensive picture of Transformers.

IV · May 21, 2020
HF-UNet: Learning Hierarchically Inter-Task Relevance in Multi-Task U-Net for Accurate Prostate Segmentation

Kelei He, Chunfeng Lian, Bing Zhang et al.

Accurate segmentation of the prostate is a key step in external beam radiation therapy treatments. In this paper, we tackle the challenging task of prostate segmentation in CT images with a two-stage network, in which 1) the first stage quickly localizes the prostate, and 2) the second stage accurately segments it. To precisely segment the prostate in the second stage, we formulate prostate segmentation as a multi-task learning framework, which includes a main task to segment the prostate and an auxiliary task to delineate the prostate boundary. Here, the second task provides additional guidance for the unclear prostate boundary in CT images. Moreover, conventional multi-task deep networks typically share most of the parameters (i.e., feature representations) across all tasks, which may limit their data fitting ability, as the specificities of different tasks are inevitably ignored. By contrast, we address this with a hierarchically-fused U-Net structure, namely HF-UNet. HF-UNet has two complementary branches for the two tasks, with a novel attention-based task consistency learning block that communicates at each level between the two decoding branches. Therefore, HF-UNet can hierarchically learn the shared representations for different tasks while preserving the specificities of the learned representations for each task. We extensively evaluated the proposed method on a large planning CT image dataset, including images acquired from 339 patients. The experimental results show that HF-UNet outperforms conventional multi-task network architectures and state-of-the-art methods.

IV · May 15, 2020
MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling

Kelei He, Chunfeng Lian, Ehsan Adeli et al.

Fully convolutional networks (FCNs), including UNet and VNet, are widely used network architectures for semantic segmentation in recent studies. However, conventional FCNs are typically trained with the cross-entropy or Dice loss, which only calculates the error between predictions and ground-truth labels for pixels individually. This often results in non-smooth neighborhoods in the predicted segmentation. To address this problem, we propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate by a multi-task UNet architecture. We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network. Therefore, the proposed network has a dual-branch architecture that tackles two tasks: 1) a segmentation sub-network aiming to generate the prostate segmentation, and 2) a voxel-metric learning sub-network aiming to improve the quality of the learned feature space supervised by a metric loss. Specifically, the voxel-metric learning sub-network samples tuples (including triplets and pairs) at the voxel level from the intermediate feature maps. Unlike conventional deep metric learning methods that generate triplets or pairs at the image level before the training phase, our proposed voxel-wise tuples are sampled in an online manner and operated in an end-to-end fashion via multi-task learning. To evaluate the proposed method, we conduct extensive experiments on a real CT image dataset consisting of 339 patients. The ablation studies show that our method can effectively learn more representative voxel-level features compared with conventional learning methods that use the cross-entropy or Dice loss. The comparisons also show that the proposed method outperforms state-of-the-art methods by a reasonable margin.
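For reference, the two conventional segmentation losses named in the abstract can be sketched in a few lines of NumPy. This is a generic illustration for the binary case, not the paper's code, and it does not cover the voxel-metric learning branch. The function names and the smoothing constant `eps` are assumptions.

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-voxel cross-entropy between predicted probabilities and a binary mask."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))

def soft_dice_loss(pred, target, eps=1e-6):
    """One minus the soft Dice overlap between predicted probabilities and a binary mask."""
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
    return float(1.0 - dice)

mask = np.array([[1, 1], [0, 0]])
perfect = soft_dice_loss(mask, mask)       # prediction identical to the mask
disjoint = soft_dice_loss(1 - mask, mask)  # prediction with no overlap
```

Both losses compare each predicted value only with the label at the same location, so neither term encodes how a voxel's features relate to those of its neighbors. That is the gap the abstract's metric-learning branch is meant to address.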

IV · May 8, 2020
Synergistic Learning of Lung Lobe Segmentation and Hierarchical Multi-Instance Classification for Automated Severity Assessment of COVID-19 in CT Images

Kelei He, Wei Zhao, Xingzhi Xie et al.

Understanding chest CT imaging of the coronavirus disease 2019 (COVID-19) will help detect infections early and assess disease progression. In particular, automated severity assessment of COVID-19 in CT images plays an essential role in identifying cases that are in great need of intensive clinical care. However, it is often challenging to accurately assess the severity of this disease in CT images, due to variable infection regions in the lungs, similar imaging biomarkers, and large inter-case variations. To this end, we propose a synergistic learning framework for automated severity assessment of COVID-19 in 3D CT images, by jointly performing lung lobe segmentation and multi-instance classification. Considering that only a few infection regions in a CT image are related to the severity assessment, we first represent each input image by a bag that contains a set of 2D image patches (each cropped from a specific slice). A multi-task multi-instance deep network (called M²UNet) is then developed to assess the severity of COVID-19 patients and segment the lung lobes simultaneously. Our M²UNet consists of a patch-level encoder, a segmentation sub-network for lung lobe segmentation, and a classification sub-network for severity assessment (with a unique hierarchical multi-instance learning strategy). Here, the context information provided by segmentation can be implicitly employed to improve the performance of severity assessment. Extensive experiments were performed on a real COVID-19 CT image dataset consisting of 666 chest CT images, with results suggesting the effectiveness of our proposed method compared to several state-of-the-art methods.

MED-PH · Jul 30, 2017
LEARN: Learned Experts' Assessment-based Reconstruction Network for Sparse-data CT

Hu Chen, Yi Zhang, Yunjin Chen et al.

Compressive sensing (CS) has proved effective for tomographic reconstruction from sparsely collected data or under-sampled measurements, which are practically important for few-view CT, tomosynthesis, interior tomography, and so on. To perform sparse-data CT, iterative reconstruction commonly uses regularizers in the CS framework. Currently, how to choose the regularization parameters adaptively is a major open problem. In this paper, inspired by machine learning, especially deep learning, we unfold a state-of-the-art "fields of experts" based iterative reconstruction scheme up to a number of iterations for data-driven training, construct a Learned Experts' Assessment-based Reconstruction Network ("LEARN") for sparse-data CT, and demonstrate the feasibility and merits of our LEARN network. The experimental results show that our proposed LEARN network achieves competitive performance on the well-known Mayo Clinic Low-Dose Challenge Dataset relative to several state-of-the-art methods, in terms of artifact reduction, feature preservation, and computational speed. This is consistent with our insight that, because all the regularization terms and parameters used in the iterative reconstruction are now learned from the training data, our LEARN network utilizes application-oriented knowledge more effectively and recovers underlying images more favorably than competing algorithms. Also, the number of layers in the LEARN network is only 12, reducing the computational complexity of typical iterative algorithms by orders of magnitude.