CV · Jan 1, 2023 · Code
Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data
Rui Ding, Juntian Ye, Qifeng Gao et al.
Non-line-of-sight (NLOS) imaging aims to reconstruct three-dimensional hidden scenes from data measured in the line of sight, using photon time-of-flight information encoded in light after multiple diffuse reflections. Under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a severely ill-posed inverse problem, whose solution is highly likely to be degraded by noise and distortions. In this paper, we propose novel NLOS reconstruction models based on curvature regularization, i.e., the object-domain curvature regularization model and the dual (signal and object)-domain curvature regularization model. We then develop efficient optimization algorithms relying on the alternating direction method of multipliers (ADMM) with the backtracking stepsize rule, for which all solvers can be implemented on GPUs. We evaluate the proposed algorithms on both synthetic and real datasets, where they achieve state-of-the-art performance, especially in the compressed sensing setting. Based on GPU computing, our algorithm is the most effective among iterative methods, balancing reconstruction quality and computational time. All our codes and data are available at https://github.com/Duanlab123/CurvNLOS.
IV · Apr 17, 2022
Fast Multi-grid Methods for Minimizing Curvature Energy
Zhenwei Zhang, Ke Chen, Ke Tang et al.
Geometric high-order regularization methods, such as mean curvature and Gaussian curvature, have been intensively studied over the last decades due to their ability to preserve geometric properties including image edges, corners, and contrast. However, the trade-off between restoration quality and computational efficiency is an essential roadblock for high-order methods. In this paper, we propose fast multi-grid algorithms for minimizing both mean curvature and Gaussian curvature energy functionals without sacrificing accuracy for efficiency. Unlike the existing approaches based on operator splitting and the Augmented Lagrangian method (ALM), no artificial parameters are introduced in our formulation, which guarantees the robustness of the proposed algorithm. Meanwhile, we adopt the domain decomposition method to promote parallel computing and use the fine-to-coarse structure to accelerate convergence. Numerical experiments are presented on image denoising, CT, and MRI reconstruction problems to demonstrate the superiority of our method in preserving geometric structures and fine details. The proposed method is also shown to be effective on large-scale image processing problems, recovering an image of size $1024\times 1024$ within $40$s, while the ALM method requires around $200$s.
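To make the two quantities concrete, the following sketch evaluates the mean and Gaussian curvature of an image viewed as the surface z = u(x, y), using the standard differential-geometry formulas and finite differences. It is only an illustration of what the energies penalize, not the paper's multi-grid solver; the function name `image_curvatures` and the grid spacing `h` are assumptions made for this example.

```python
import numpy as np

def image_curvatures(u, h=1.0):
    """Mean and Gaussian curvature of the graph z = u(x, y) (illustrative helper)."""
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    uy, ux = np.gradient(u, h)
    uxy, uxx = np.gradient(ux, h)
    uyy, _ = np.gradient(uy, h)
    w = 1.0 + ux**2 + uy**2
    # Mean curvature of a graph surface.
    mean = ((1.0 + uy**2) * uxx - 2.0 * ux * uy * uxy + (1.0 + ux**2) * uyy) / (2.0 * w**1.5)
    # Gaussian curvature of a graph surface.
    gauss = (uxx * uyy - uxy**2) / w**2
    return mean, gauss

# Example: a paraboloid sampled on a uniform grid.
x = np.linspace(-1.0, 1.0, 201)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x)
H, K = image_curvatures(0.5 * (X**2 + Y**2), h)
```

At the apex of the paraboloid both curvatures equal 1, and for a planar image both vanish, which is why curvature penalties leave linear intensity ramps unpenalized.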
CV · Mar 12, 2023 · Code
MetaUE: Model-based Meta-learning for Underwater Image Enhancement
Zhenwei Zhang, Haorui Yan, Ke Tang et al.
The challenges in recovering underwater images are the presence of diverse degradation factors and the lack of ground truth images. Although synthetic underwater image pairs can be used to compensate for insufficient observed data, they may cause over-fitting and degrade the enhancement. This paper proposes a model-based deep learning method for restoring clean images under various underwater scenarios, which exhibits good interpretability and generalization ability. More specifically, we build a multi-variable convolutional neural network model to estimate the clean image, background light, and transmission map, respectively. An efficient loss function is also designed to closely integrate the variables based on the underwater image model. The meta-learning strategy is used to obtain a pre-trained model on the synthetic underwater dataset, which contains different types of degradation to cover the various underwater environments. The pre-trained model is then fine-tuned on real underwater datasets to obtain a reliable underwater image enhancement model, called MetaUE. Numerical experiments demonstrate that the pre-trained model has good generalization ability, allowing it to remove the color degradation in underwater images with various attenuation types, such as blue, green, and yellow. Fine-tuning enables the model to adapt to different underwater datasets, and its enhancement results outperform state-of-the-art underwater image restoration methods. All our codes and data are available at https://github.com/Duanlab123/MetaUE.
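The abstract does not spell out the underwater image model. A commonly used simplified formation model, assumed here, is I = J·t + B·(1 − t), where J is the clean image, t the transmission map, and B the background light. The sketch below shows how a loss can tie the three estimated variables together under that assumption; the function names and the mean-squared form of the loss are hypothetical and not taken from the paper.

```python
import numpy as np

def compose_underwater(clean, transmission, background):
    """Re-synthesize a degraded image from J, t, B with the assumed model I = J*t + B*(1 - t)."""
    return clean * transmission + background * (1.0 - transmission)

def physical_consistency_loss(observed, clean, transmission, background):
    """Mean squared error between the observation and its re-synthesis (illustrative loss)."""
    recon = compose_underwater(clean, transmission, background)
    return float(np.mean((observed - recon) ** 2))

# Toy example: a 4x4 RGB image, per-pixel transmission, per-channel background light.
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(4, 4, 3))
t = rng.uniform(0.2, 0.9, size=(4, 4, 1))
B = np.array([0.1, 0.5, 0.6])  # bluish-green ambient light, chosen arbitrarily
I = compose_underwater(J, t, B)
loss = physical_consistency_loss(I, J, t, B)
```

When the three estimates are mutually consistent with the observation the loss is zero, so minimizing it couples the outputs of the network rather than supervising each one separately.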
IV · Nov 14, 2022 · Code
CurvPnP: Plug-and-play Blind Image Restoration with Deep Curvature Denoiser
Yutong Li, Yuping Duan
Due to the development of deep learning-based denoisers, the plug-and-play strategy has achieved great success in image restoration problems. However, existing plug-and-play image restoration methods, such as Zhang et al. (2022), are designed for non-blind Gaussian denoising, and their performance visibly deteriorates for unknown noise. To push the limits of plug-and-play image restoration, we propose a novel framework with a blind Gaussian prior, which can deal with more complicated image restoration problems in the real world. More specifically, we build a new image restoration model by regarding the noise level as a variable, which is implemented by a two-stage blind Gaussian denoiser consisting of a noise estimation subnetwork and a denoising subnetwork, where the noise estimation subnetwork provides the noise level to the denoising subnetwork for blind noise removal. We also introduce the curvature map into the encoder-decoder architecture and the supervised attention module to achieve a highly flexible and effective convolutional neural network. Experimental results on image denoising, deblurring, and single-image super-resolution demonstrate the advantages of our deep curvature denoiser and the resulting plug-and-play blind image restoration method over state-of-the-art model-based and learning-based methods. Our model is shown to recover fine image details and tiny structures even when the noise level is unknown for different image restoration tasks. The source codes are available at https://github.com/Duanlab123/CurvPnP.
IV · Jul 30, 2022
LRIP-Net: Low-Resolution Image Prior based Network for Limited-Angle CT Reconstruction
Qifeng Gao, Rui Ding, Linyuan Wang et al.
In practical applications of computed tomography imaging, the projection data may be acquired within a limited-angle range and corrupted by noise due to the limitations of scanning conditions. Such noisy, incomplete projection data make the inverse problem ill-posed. In this work, we theoretically verify that the low-resolution reconstruction problem has better numerical stability than the high-resolution problem. We then propose a novel low-resolution image prior based CT reconstruction model that uses the low-resolution image to improve the reconstruction quality. More specifically, we build a low-resolution reconstruction problem on the down-sampled projection data, and use the reconstructed low-resolution image as prior knowledge for the original limited-angle CT problem. We solve the constrained minimization problem by the alternating direction method, with all subproblems approximated by convolutional neural networks. Numerical experiments demonstrate that our double-resolution network outperforms both the variational method and popular learning-based reconstruction methods on noisy limited-angle reconstruction problems.
CV · Jul 17, 2023
A Novel Multi-Task Model Imitating Dermatologists for Accurate Differential Diagnosis of Skin Diseases in Clinical Images
Yan-Jie Zhou, Wei Liu, Yuan Gao et al.
Skin diseases are among the most prevalent health issues, and accurate computer-aided diagnosis methods are of importance for both dermatologists and patients. However, most of the existing methods overlook the essential domain knowledge required for skin disease diagnosis. A novel multi-task model, namely DermImitFormer, is proposed to fill this gap by imitating dermatologists' diagnostic procedures and strategies. Through multi-task learning, the model simultaneously predicts body parts and lesion attributes in addition to the disease itself, enhancing diagnosis accuracy and improving diagnosis interpretability. The designed lesion selection module mimics dermatologists' zoom-in action, effectively highlighting the local lesion features from noisy backgrounds. Additionally, the presented cross-interaction module explicitly models the complicated diagnostic reasoning between body parts, lesion attributes, and diseases. To provide a more robust evaluation of the proposed method, a large-scale clinical image dataset of skin diseases with significantly more cases than existing datasets has been established. Extensive experiments on three different datasets consistently demonstrate the state-of-the-art recognition performance of the proposed approach.
CV · Apr 30 · Code
Continuous-tone Simple Points: An $\ell_0$-Norm of Cyclic Gradient for Topology-Preserving Data-Driven Image Segmentation
Wenxiao Li, Faqiang Wang, Yuping Duan et al.
Topological features play an essential role in ensuring geometric plausibility and structural consistency in image analysis tasks such as segmentation and skeletonization. However, integrating topology-preserving learning based on simple points into deep learning tasks remains challenging, as existing simple point detection methods are confined to binary images and are non-differentiable, rendering them incompatible with gradient-based optimization in modern deep learning. Moreover, morphological and purely data-driven approaches often fail to guarantee topological consistency. To address these limitations, we propose a novel method that directly computes simple points on continuous-valued images, enabling differentiable topological inference. Building on this theory, we develop an efficient skeleton extraction algorithm that preserves topological structures in binary and continuous-valued images. Furthermore, we design a variational model that enforces topological constraints by preserving topologically non-removable (i.e., non-simple) points, which can be seamlessly integrated into any deep neural network for segmentation with softmax or sigmoid outputs. Experimental results demonstrate that the proposed approach effectively improves topological integrity and structural accuracy across multiple benchmarks. The codes are available at https://github.com/levnsio/CSP.
IV · Jan 27, 2024 · Code
ParaTransCNN: Parallelized TransCNN Encoder for Medical Image Segmentation
Hongkun Sun, Jing Xu, Yuping Duan
Convolutional neural network-based methods have become increasingly popular for medical image segmentation due to their outstanding performance. However, they struggle to capture long-range dependencies, which are essential for accurately modeling global contextual correlations. Thanks to their ability to model long-range dependencies by expanding the receptive field, transformer-based methods have gained prominence. Inspired by this, we propose an advanced 2D feature extraction method that combines the convolutional neural network and Transformer architectures. More specifically, we introduce a parallelized encoder structure, where one branch uses ResNet to extract local information from images, while the other branch uses a Transformer to extract global information. Furthermore, we integrate pyramid structures into the Transformer to extract global information at varying resolutions, especially in intensive prediction tasks. To efficiently utilize the different information in the parallelized encoder at the decoder stage, we use a channel attention module to merge the features of the encoder and propagate them through skip connections and bottlenecks. Extensive numerical experiments are performed on aortic vessel tree, cardiac, and multi-organ datasets. Compared with state-of-the-art medical image segmentation methods, our method is shown to achieve better segmentation accuracy, especially on small organs. The code is publicly available at https://github.com/HongkunSun/ParaTransCNN.
CV · Dec 22, 2025
Total Curvature Regularization and its Minimization for Surface and Image Smoothing
Tianle Lu, Ke Chen, Yuping Duan
We introduce a novel formulation for curvature regularization by penalizing normal curvatures from multiple directions. This total normal curvature regularization is capable of producing solutions with sharp edges and precise isotropic properties. To tackle the resulting high-order nonlinear optimization problem, we reformulate it as the task of finding the steady-state solution of a time-dependent partial differential equation (PDE) system. Time discretization is achieved through operator splitting, where each subproblem at the fractional steps either has a closed-form solution or can be efficiently solved using advanced algorithms. Our method circumvents the need for complex parameter tuning and demonstrates robustness to parameter choices. The efficiency and effectiveness of our approach have been rigorously validated in the context of surface and image smoothing problems.
AI · Sep 22, 2024
Can Large Language Models Logically Predict Myocardial Infarction? Evaluation based on UK Biobank Cohort
Yuxing Zhi, Yuan Guo, Kai Yuan et al.
Background: Large language models (LLMs) have seen extraordinary advances with applications in clinical decision support. However, high-quality evidence is urgently needed on the potential and limitations of LLMs in providing accurate clinical decisions based on real-world medical data. Objective: To evaluate quantitatively whether universal state-of-the-art LLMs (ChatGPT and GPT-4) can predict the incidence risk of myocardial infarction (MI) with logical inference, and to compare various models in order to assess the performance of LLMs comprehensively. Methods: In this retrospective cohort study, 482,310 participants recruited from 2006 to 2010 were initially included in the UK Biobank database and later resampled into a final cohort of 690 participants. For each participant, tabular data of the risk factors of MI were transformed into standardized textual descriptions for ChatGPT recognition. Responses were generated by asking ChatGPT to select a score ranging from 0 to 10 representing the risk. Chain of Thought (CoT) questioning was used to evaluate whether LLMs make predictions logically. The predictive performance of ChatGPT was compared with published medical indices, traditional machine learning models, and other large language models. Conclusions: Current LLMs are not ready to be applied in clinical medicine. Future medical LLMs should be expert in medical domain knowledge so that they can understand both natural language and quantified medical data, and make logical inferences from them.
CV · Sep 22, 2024
Anisotropic Diffusion Probabilistic Model for Imbalanced Image Classification
Jingyu Kong, Yuan Guo, Yu Wang et al.
Real-world data often has a long-tailed distribution, where the scarcity of tail samples significantly limits the model's generalization ability. Denoising Diffusion Probabilistic Models (DDPM) are generative models based on stochastic differential equation theory and have demonstrated impressive performance in image classification tasks. However, existing diffusion probabilistic models do not perform satisfactorily in classifying tail classes. In this work, we propose the Anisotropic Diffusion Probabilistic Model (ADPM) for imbalanced image classification problems. We utilize the data distribution to control the diffusion speed of different class samples during the forward process, effectively improving the classification accuracy of the denoiser in the reverse process. Specifically, we provide a theoretical strategy for selecting noise levels for different categories in the diffusion process based on error analysis theory to address the imbalanced classification problem. Furthermore, we integrate global and local image priors in the forward process to enhance the model's discriminative ability in the spatial dimension, while incorporating semantic-level contextual information in the reverse process to boost the model's discriminative power and robustness. Through comparisons with state-of-the-art methods on four medical benchmark datasets, we validate the effectiveness of the proposed method in handling long-tail data. Our results confirm that the anisotropic diffusion model significantly improves the classification accuracy of rare classes while maintaining the accuracy of head classes. On the skin lesion datasets PAD-UFES and HAM10000, the F1-scores of our method improved by 4% and 3%, respectively, compared to the original diffusion probabilistic model.
CV · Dec 1, 2024
LUIEO: A Lightweight Model for Integrating Underwater Image Enhancement and Object Detection
Bin Li, Li Li, Zhenwei Zhang et al.
Underwater optical images inevitably suffer from various degradation factors such as blurring, low contrast, and color distortion, which hinder the accuracy of object detection tasks. Due to the lack of paired underwater/clean images, most methods adopt a strategy of first enhancing and then detecting, resulting in a lack of feature communication between the two learning tasks. Moreover, because the degradation factors of underwater images are diverse while the number of samples is limited, existing underwater enhancement methods struggle to enhance degraded images from unknown water bodies, which limits the improvement of object detection accuracy. As a result, most underwater target detection results are still displayed on degraded images, making it difficult to visually judge the correctness of the detection results. To address these issues, this paper proposes a multi-task learning method that simultaneously enhances underwater images and improves detection accuracy. Compared with single-task learning, the integrated model allows for the dynamic adjustment of information communication and sharing between different tasks. Because real underwater images provide only annotated object labels, this paper introduces physical constraints to ensure that object detection tasks do not interfere with image enhancement tasks. Specifically, it introduces a physical module that decomposes underwater images into clean images, background light, and transmission images, and uses a physical model to re-synthesize the underwater images for self-supervision. Numerical experiments demonstrate that the proposed model achieves satisfactory results in visual performance, object detection accuracy, and detection efficiency compared to state-of-the-art methods.
LG · Oct 9, 2025
Deep Neural Networks Inspired by Differential Equations
Yongshuai Liu, Lianfang Wang, Kuilin Qin et al.
Deep learning has become a pivotal technology in fields such as computer vision, scientific computing, and dynamical systems, significantly advancing these disciplines. However, neural networks persistently face challenges related to theoretical understanding, interpretability, and generalization. To address these issues, researchers are increasingly adopting a differential equations perspective to propose a unified theoretical framework and systematic design methodologies for neural networks. In this paper, we provide an extensive review of deep neural network architectures and dynamic modeling methods inspired by differential equations. We specifically examine deep neural network models and deterministic dynamical network constructs based on ordinary differential equations (ODEs), as well as regularization techniques and stochastic dynamical network models informed by stochastic differential equations (SDEs). We present numerical comparisons of these models to illustrate their characteristics and performance. Finally, we explore promising research directions in integrating differential equations with deep learning to offer new insights for developing intelligent computational methods with enhanced interpretability and generalization capabilities.
CV · Aug 13, 2025
Noise-adapted Neural Operator for Robust Non-Line-of-Sight Imaging
Lianfang Wang, Kuilin Qin, Xueying Liu et al.
In computational imaging, and especially in non-line-of-sight (NLOS) imaging, information about obscured or hidden scenes is extracted from indirect light signals resulting from multiple reflections or scattering. The inherently weak nature of these signals, coupled with their susceptibility to noise, necessitates the integration of physical processes to ensure accurate reconstruction. This paper presents a parameterized inverse problem framework tailored for large-scale linear problems in 3D imaging reconstruction. First, a noise estimation module adaptively assesses the noise levels present in transient data. Next, a parameterized neural operator is developed to approximate the inverse mapping, enabling rapid end-to-end image reconstruction. Our 3D image reconstruction framework, grounded in operator learning, is constructed through deep algorithm unfolding, which not only provides model interpretability but also enables dynamic adaptation to varying noise levels in the acquired data, thereby ensuring consistently robust and accurate reconstruction. Furthermore, we introduce a novel method for fusing global and local spatiotemporal data features. By integrating structural and detailed information, this method significantly enhances both accuracy and robustness. Comprehensive numerical experiments conducted on both simulated and real datasets substantiate the efficacy of the proposed method. It performs well on fast scanning data and sparse illumination point data, offering a viable solution for NLOS imaging in complex scenarios.
CV · Apr 17, 2025
Contour Field based Elliptical Shape Prior for the Segment Anything Model
Xinyu Zhao, Jun Liu, Faqiang Wang et al.
The elliptical shape prior information plays a vital role in improving the accuracy of image segmentation for specific tasks in medical and natural images. Existing deep learning-based segmentation methods, including the Segment Anything Model (SAM), often struggle to produce segmentation results with elliptical shapes efficiently. This paper proposes a new approach to integrate the prior of elliptical shapes into the deep learning-based SAM image segmentation techniques using variational methods. The proposed method establishes a parameterized elliptical contour field, which constrains the segmentation results to align with predefined elliptical contours. Utilizing the dual algorithm, the model seamlessly integrates image features with elliptical priors and spatial regularization priors, thereby greatly enhancing segmentation accuracy. By decomposing SAM into four mathematical sub-problems, we integrate the variational ellipse prior to design a new SAM network structure, ensuring that the segmentation output of SAM consists of elliptical regions. Experimental results on some specific image datasets demonstrate an improvement over the original SAM.
CV · Mar 23, 2025
Geometric Constrained Non-Line-of-Sight Imaging
Xueying Liu, Lianfang Wang, Jun Liu et al.
Normal reconstruction is crucial in non-line-of-sight (NLOS) imaging, as it provides key geometric and lighting information about hidden objects, which significantly improves reconstruction accuracy and scene understanding. However, jointly estimating normals and albedo expands the problem from matrix-valued functions to tensor-valued functions, substantially increasing complexity and computational difficulty. In this paper, we propose a novel joint albedo-surface reconstruction method, which utilizes the Frobenius norm of the shape operator to control the variation rate of the normal field. It is the first attempt to apply regularization methods to the reconstruction of surface normals for hidden objects. By improving the accuracy of the normal field, it enhances detail representation and achieves high-precision reconstruction of hidden object geometry. The proposed method demonstrates robustness and effectiveness on both synthetic and experimental datasets. On transient data captured within 15 seconds, our surface normal-regularized reconstruction model produces more accurate surfaces than recently proposed methods and is 30 times faster than the existing surface reconstruction approach.
IV · Jan 28, 2024
Low-resolution Prior Equilibrium Network for CT Reconstruction
Yijie Yang, Qifeng Gao, Yuping Duan
The unrolling method has been investigated for learning variational models in X-ray computed tomography. However, it has been observed that directly unrolling the regularization model through gradient descent does not produce satisfactory results. In this paper, we present a novel deep learning-based CT reconstruction model, where the low-resolution image is introduced to obtain an effective regularization term for improving the network's robustness. Our approach constructs the backbone network architecture by algorithm unrolling, realized using the deep equilibrium architecture. We theoretically discuss the convergence of the proposed low-resolution prior equilibrium model and provide the conditions to guarantee convergence. Experimental results on both sparse-view and limited-angle reconstruction problems are provided, demonstrating that our end-to-end low-resolution prior equilibrium model outperforms other state-of-the-art methods in terms of noise reduction, contrast-to-noise ratio, and preservation of edge details.
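A deep equilibrium architecture defines its output as a fixed point z* = f(z*, x) of a single layer rather than as the result of a fixed number of unrolled steps. The sketch below illustrates that idea with plain fixed-point iteration on a toy layer; the layer `tanh(W z + x)`, the scaling of `W` to spectral norm 0.5, and the helper name `fixed_point_solve` are assumptions chosen so that the map is a contraction and the iteration provably converges. It does not reproduce the paper's network or its convergence conditions.

```python
import numpy as np

def fixed_point_solve(layer, x, z0, tol=1e-10, max_iter=500):
    """Iterate z <- layer(z, x) until successive iterates differ by at most tol."""
    z = z0
    for k in range(max_iter):
        z_next = layer(z, x)
        if np.linalg.norm(z_next - z) <= tol:
            return z_next, k + 1
        z = z_next
    return z, max_iter

# Toy layer: tanh is 1-Lipschitz, so scaling W to spectral norm 0.5
# makes z -> tanh(W z + x) a contraction with factor at most 0.5.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
W = 0.5 * M / np.linalg.norm(M, 2)
x = rng.standard_normal(8)

def layer(z, x):
    return np.tanh(W @ z + x)

z_star, n_iter = fixed_point_solve(layer, x, np.zeros(8))
```

Because the toy map is contractive, the equilibrium is unique and independent of the starting point, which is the kind of property a convergence condition for an equilibrium model is meant to secure.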
OC · Oct 18, 2019
Bilinear Constraint based ADMM for Mixed Poisson-Gaussian Noise Removal
Jie Zhang, Yuping Duan, Yue Lu et al.
In this paper, we propose new operator-splitting algorithms for the total variation regularized infimal convolution (TV-IC) model [4] in order to remove mixed Poisson-Gaussian (MPG) noise. In the existing splitting algorithm for TV-IC, an inner loop based on Newton's method had to be adopted for one nonlinear optimization subproblem, which increased the computational cost per outer loop. By introducing a new bilinear constraint and applying the alternating direction method of multipliers (ADMM), all subproblems of the proposed algorithms, named BCA (short for Bilinear Constraint based ADMM algorithm) and BCAf (short for a variant of BCA in fully split form), can be solved very efficiently; in particular, the subproblems of BCAf can be computed without any inner iterations. Under mild conditions, the convergence of the proposed BCA is investigated. Numerically, compared to existing primal-dual algorithms for the TV-IC model, the proposed algorithms have fewer tunable parameters, converge much faster, and produce comparable results.
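For readers unfamiliar with ADMM splitting, the sketch below applies it to a much simpler problem than the one in the abstract: 1D total variation denoising with a quadratic (Gaussian) fidelity term, min_u 0.5·||u − f||² + λ·||Du||₁, split through the constraint z = Du. It is not the TV-IC model and contains no bilinear constraint; it only illustrates how a splitting can leave every subproblem with a closed-form solution, so no inner iterations are needed. The function names and default parameters are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Closed-form minimizer of tau*|z| + 0.5*(z - v)^2, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def tv_denoise_admm(f, lam=0.5, rho=1.0, iters=200):
    """Scaled-form ADMM for min_u 0.5*||u - f||^2 + lam*||D u||_1 with z = D u."""
    n = f.size
    D = np.diff(np.eye(n), axis=0)      # (n-1) x n forward-difference matrix
    A = np.eye(n) + rho * D.T @ D       # system matrix of the u-subproblem
    u = f.copy()
    z = np.zeros(n - 1)
    w = np.zeros(n - 1)                 # scaled dual variable
    for _ in range(iters):
        u = np.linalg.solve(A, f + rho * D.T @ (z - w))   # linear solve
        z = soft_threshold(D @ u + w, lam / rho)          # shrinkage
        w = w + D @ u - z                                 # dual update
    return u

# Toy signal: a unit step corrupted by a deterministic alternating perturbation.
n = 40
f = np.concatenate([np.zeros(n // 2), np.ones(n // 2)]) + 0.1 * (-1.0) ** np.arange(n)
u = tv_denoise_admm(f)
```

Each iteration consists of one linear solve, one elementwise shrinkage, and one dual update, which is the property the fully split variant in the abstract aims for on its harder model.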