Dong Liang

h-index 47

46 papers

1,047 citations

Novelty 52%

AI Score 53

Ranked #12,819 of 194,257 authors (top 7%); #70 in IV (top 2%)

46 Papers

29.5 IV Aug 10, 2022 Code

High-Frequency Space Diffusion Models for Accelerated MRI

Chentao Cao, Zhuo-Xu Cui, Yue Wang et al.

Diffusion models with continuous stochastic differential equations (SDEs) have shown superior performance in image generation. They can serve as deep generative priors for solving the inverse problem in magnetic resonance (MR) reconstruction. However, low-frequency regions of $k$-space data are typically fully sampled in fast MR imaging, while existing diffusion models are performed throughout the entire image or $k$-space, inevitably introducing uncertainty in the reconstruction of low-frequency regions. Additionally, existing diffusion models often demand substantial iterations to converge, resulting in time-consuming reconstructions. To address these challenges, we propose a novel SDE tailored specifically for MR reconstruction with the diffusion process in high-frequency space (referred to as HFS-SDE). This approach ensures determinism in the fully sampled low-frequency regions and accelerates the sampling procedure of reverse diffusion. Experiments conducted on the publicly available fastMRI dataset demonstrate that the proposed HFS-SDE method outperforms traditional parallel imaging methods, supervised deep learning, and existing diffusion models in terms of reconstruction accuracy and stability. The fast convergence properties are also confirmed through theoretical and experimental validation. Our code and weights are available at https://github.com/Aboriginer/HFS-SDE.
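The idea of restricting the diffusion to high-frequency space can be illustrated with a minimal sketch. This is not the HFS-SDE formulation itself: the circular low-frequency mask, the `low_freq_radius` and `sigma` parameters, and the function name are all assumptions made for illustration. The sketch only shows that noise added through a high-frequency mask leaves the low-frequency k-space samples untouched.

```python
import numpy as np


def perturb_high_frequency(image, low_freq_radius, sigma, rng):
    """Add complex Gaussian noise to the high-frequency part of k-space only.

    Hypothetical illustration: the circular mask and the noise scale `sigma`
    are assumptions, not the schedule used by HFS-SDE.
    """
    # Centered k-space of the image (zero frequency at index h // 2, w // 2).
    kspace = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    # Distance of every k-space sample from the center.
    dist = np.hypot(yy - h // 2, xx - w // 2)
    # True where a sample counts as high frequency.
    high_mask = dist > low_freq_radius
    noise = sigma * (rng.standard_normal((h, w)) + 1j * rng.standard_normal((h, w)))
    # Low-frequency samples are multiplied by False (0) and stay deterministic.
    noisy_kspace = kspace + high_mask * noise
    return noisy_kspace, high_mask
```

Calling `perturb_high_frequency(image, 2.0, 0.5, np.random.default_rng(0))` on a small real-valued array returns the perturbed k-space together with the mask, so the unchanged low-frequency block can be checked directly.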

21.4 IV Aug 31, 2023 Code

Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery

Yuyan Zhou, Dong Liang, Songcan Chen et al.

When taking images against strong light sources, the resulting images often contain heterogeneous flare artifacts. These artifacts can significantly affect image visual quality and downstream computer vision tasks. Because collecting real data pairs of flare-corrupted/flare-free images for training flare removal models is challenging, current methods use the direct-add approach to synthesize data. However, these methods do not consider automatic exposure and tone mapping in the image signal processing (ISP) pipeline, which limits the generalization capability of deep models trained on such data. Besides, existing methods struggle to handle multiple light sources due to the different sizes, shapes and illuminance of various light sources. In this paper, we propose a solution to improve the performance of lens flare removal by revisiting the ISP, remodeling the principle of automatic exposure in the synthesis pipeline, and designing a more reliable light source recovery strategy. The new pipeline approaches realistic imaging by discriminating the local and global illumination through convex combination, avoiding global illumination shifting and local over-saturation. Our strategy for recovering multiple light sources convexly averages the input and output of the neural network based on illuminance levels, thereby avoiding the need for a hard threshold in identifying light sources. We also contribute a new flare removal testing dataset containing flare-corrupted images captured by ten types of consumer electronics. The dataset facilitates the verification of the generalization capability of flare removal methods. Extensive experiments show that our solution can effectively improve the performance of lens flare removal and push the frontier toward more general situations.
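The light source recovery step, a convex average of network input and output driven by illuminance, might look roughly like the sketch below. The Rec. 709 luminance weights, the linear ramp between `low` and `high`, and the function name are assumptions for illustration; the abstract does not specify how the weight is computed.

```python
import numpy as np


def blend_light_sources(flare_input, network_output, low=0.9, high=1.0):
    """Convexly blend input and output with an illuminance-based soft weight.

    Hypothetical sketch: `low` and `high` are illustrative values, and the
    linear ramp stands in for whatever weighting the paper actually uses.
    Both images are float arrays of shape (H, W, 3) with values in [0, 1].
    """
    # Per-pixel luminance of the input (Rec. 709 weights, an assumption).
    luminance = flare_input @ np.array([0.2126, 0.7152, 0.0722])
    # Soft weight in [0, 1]: 0 for dark pixels, 1 for saturated ones,
    # so no hard threshold decides what counts as a light source.
    weight = np.clip((luminance - low) / (high - low), 0.0, 1.0)[..., None]
    # Convex combination: bright regions keep the input's light source,
    # everything else takes the flare-removed network output.
    return weight * flare_input + (1.0 - weight) * network_output
```

Because the weight varies smoothly with luminance, pixels near the edge of a light source receive a mixture of the two images rather than an abrupt switch.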

5.0 CV Apr 28, 2023

ALL-E: Aesthetics-guided Low-light Image Enhancement

Ling Li, Dong Liang, Yuanhang Gao et al.

Evaluating the performance of low-light image enhancement (LLE) is highly subjective, thus making integrating human preferences into image enhancement a necessity. Existing methods fail to consider this and present a series of potentially valid heuristic criteria for training enhancement models. In this paper, we propose a new paradigm, i.e., aesthetics-guided low-light image enhancement (ALL-E), which introduces aesthetic preferences to LLE and motivates training in a reinforcement learning framework with an aesthetic reward. Each pixel, functioning as an agent, refines itself by recursive actions, i.e., its corresponding adjustment curve is estimated sequentially. Extensive experiments show that integrating aesthetic assessment improves both subjective experience and objective evaluation. Our results on various benchmarks demonstrate the superiority of ALL-E over state-of-the-art methods.

9.5 IV Mar 21, 2022 Code

K-space and Image Domain Collaborative Energy based Model for Parallel MRI Reconstruction

Zongjiang Tu, Chen Jiang, Yu Guan et al.

Decreasing magnetic resonance (MR) image acquisition times can potentially make MR examinations more accessible. Prior work, including deep learning models, has been devoted to solving the problem of long MRI acquisition times. Recently, deep generative models have exhibited great potential in algorithm robustness and usage flexibility. Nevertheless, none of the existing schemes can be learned on or applied to the k-space measurements directly. Furthermore, how well deep generative models work in the hybrid domain is also worth investigating. In this work, by taking advantage of deep energy-based models, we propose a k-space and image domain collaborative generative model to comprehensively estimate the MR data from under-sampled measurements. Experimental comparisons with state-of-the-art methods demonstrated that the proposed hybrid method has lower reconstruction error and is more stable under different acceleration factors.

6.8 CV Sep 6, 2023

Patched Line Segment Learning for Vector Road Mapping

Jiakun Xu, Bowen Xu, Gui-Song Xia et al.

This paper presents a novel approach to computing vector road maps from satellite remotely sensed images, building upon a well-defined Patched Line Segment (PaLiS) representation for road graphs that holds geometric significance. Unlike prevailing methods that derive road vector representations from satellite images using binary masks or keypoints, our method employs line segments. These segments not only convey road locations but also capture their orientations, making them a robust choice for representation. More precisely, given an input image, we divide it into non-overlapping patches and predict a suitable line segment within each patch. This strategy enables us to capture spatial and structural cues from these patch-based line segments, simplifying the process of constructing the road network graph without the necessity of additional neural networks for connectivity. In our experiments, we demonstrate how an effective representation of a road graph significantly enhances the performance of vector road mapping on established benchmarks, without requiring extensive modifications to the neural network architecture. Furthermore, our method achieves state-of-the-art performance with just 6 GPU hours of training, leading to a substantial 32-fold reduction in training costs in terms of GPU hours.

12.8 IV May 8, 2022 Code

WKGM: Weight-K-space Generative Model for Parallel Imaging Reconstruction

Zongjiang Tu, Die Liu, Xiaoqing Wang et al.

Deep learning based parallel imaging (PI) has made great progress in recent years in accelerating magnetic resonance imaging (MRI). Nevertheless, it still has some limitations; in particular, existing methods are deficient in robustness and flexibility. In this work, we propose a method to explore k-space domain learning via robust generative modeling for flexible calibration-less PI reconstruction, coined weight-k-space generative model (WKGM). Specifically, WKGM is a generalized k-space domain model, where the k-space weighting technology and high-dimensional space augmentation design are efficiently incorporated for score-based generative model training, resulting in good and robust reconstructions. In addition, WKGM is flexible and thus can be synergistically combined with various traditional k-space PI models, which can make full use of the correlation between multi-coil data and realize calibration-less PI. Even though our model was trained on only 500 images, experimental results with varying sampling patterns and acceleration factors demonstrate that WKGM can attain state-of-the-art reconstruction results with the well-learned k-space generative prior.

22.4 IV Sep 2, 2022

Self-Score: Self-Supervised Learning on Score-Based Models for MRI Reconstruction

Zhuo-Xu Cui, Chentao Cao, Shaonan Liu et al.

Recently, score-based diffusion models have shown satisfactory performance in MRI reconstruction. Most of these methods require a large amount of fully sampled MRI data as a training set, which, sometimes, is difficult to acquire in practice. This paper proposes a fully-sampled-data-free score-based diffusion model for MRI reconstruction, which learns the fully sampled MR image prior in a self-supervised manner on undersampled data. Specifically, we first infer the fully sampled MR image distribution from the undersampled data by Bayesian deep learning, then perturb the data distribution and approximate their probability density gradient by training a score function. Leveraging the learned score function as a prior, we can reconstruct the MR image by performing conditioned Langevin Markov chain Monte Carlo (MCMC) sampling. Experiments on the public dataset show that the proposed method outperforms existing self-supervised MRI reconstruction methods and achieves comparable performances with the conventional (fully sampled data trained) score-based diffusion methods.

13.7 IV Aug 15, 2022 Code

One-shot Generative Prior in Hankel-k-space for Parallel Imaging Reconstruction

Hong Peng, Chen Jiang, Jing Cheng et al.

Magnetic resonance imaging serves as an essential tool for clinical diagnosis. However, it suffers from a long acquisition time. The utilization of deep learning, especially deep generative models, offers aggressive acceleration and better reconstruction in magnetic resonance imaging. Nevertheless, learning the data distribution as prior knowledge and reconstructing the image from limited data remains challenging. In this work, we propose a novel Hankel-k-space generative model (HKGM), which can generate samples from a training set of as few as one k-space dataset. At the prior learning stage, we first construct a large Hankel matrix from k-space data, then extract multiple structured k-space patches from the large Hankel matrix to capture the internal distribution among different patches. Extracting patches from a Hankel matrix enables the generative model to be learned from a redundant and low-rank data space. At the iterative reconstruction stage, it is observed that the desired solution obeys the learned prior knowledge. The intermediate reconstruction solution is updated by taking it as the input of the generative model. The updated result is then processed alternately by imposing a low-rank penalty on its Hankel matrix and a data consistency constraint on the measurement data. Experimental results confirmed that the internal statistics of patches within a single k-space dataset carry enough information for learning a powerful generative model and provide state-of-the-art reconstruction.
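The Hankel matrix construction and the low-rank penalty can be sketched as follows. This is a minimal single-coil illustration under stated assumptions: the sliding-window construction follows the common structured low-rank convention of vectorizing each k-space window into one row, and hard truncation of singular values stands in for whatever low-rank penalty HKGM applies. The function names and the `kernel` and `rank` parameters are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def hankel_from_kspace(kspace, kernel=(3, 3)):
    """Build a block-Hankel-structured matrix from 2D k-space.

    Each row is one vectorized sliding window, so neighboring rows share
    most of their entries (the redundancy that makes the matrix low rank).
    """
    windows = sliding_window_view(kspace, kernel)
    return windows.reshape(-1, kernel[0] * kernel[1])


def low_rank_truncate(matrix, rank):
    """Keep only the `rank` largest singular values (assumed penalty)."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    s[rank:] = 0
    return (u * s) @ vh
```

For a 4 x 4 k-space array and a 3 x 3 kernel this yields a 4 x 9 matrix, one row per window position. Mapping the truncated matrix back to k-space (averaging the overlapping entries) is omitted here.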

9.1 CV Apr 11, 2023

SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for Accelerated MRI

Zhuo-Xu Cui, Chentao Cao, Yue Wang et al.

Diffusion models have emerged as a leading methodology for image generation and have proven successful in the realm of magnetic resonance imaging (MRI) reconstruction. However, existing reconstruction methods based on diffusion models are primarily formulated in the image domain, making the reconstruction quality susceptible to inaccuracies in coil sensitivity maps (CSMs). k-space interpolation methods can effectively address this issue but conventional diffusion models are not readily applicable in k-space interpolation. To overcome this challenge, we introduce a novel approach called SPIRiT-Diffusion, which is a diffusion model for k-space interpolation inspired by the iterative self-consistent SPIRiT method. Specifically, we utilize the iterative solver of the self-consistent term (i.e., k-space physical prior) in SPIRiT to formulate a novel stochastic differential equation (SDE) governing the diffusion process. Subsequently, k-space data can be interpolated by executing the diffusion process. This innovative approach highlights the optimization model's role in designing the SDE in diffusion models, enabling the diffusion process to align closely with the physics inherent in the optimization model, a concept referred to as model-driven diffusion. We evaluated the proposed SPIRiT-Diffusion method using a 3D joint intracranial and carotid vessel wall imaging dataset. The results convincingly demonstrate its superiority over image-domain reconstruction methods, achieving high reconstruction quality even at a substantial acceleration rate of 10.

1.5 CV Mar 24, 2023

Search By Image: Deeply Exploring Beneficial Features for Beauty Product Retrieval

Mingqiang Wei, Qian Sun, Haoran Xie et al.

Searching by image is popular yet still challenging due to the extensive interference arising from i) data variations (e.g., background, pose, visual angle, brightness) of real-world captured images and ii) similar images in the query dataset. This paper studies the practically meaningful problem of beauty product retrieval (BPR) by neural networks. We broadly extract different types of image features and ask whether these features help to i) suppress data variations of real-world captured images, and ii) distinguish one image from others that look very similar but show intrinsically different beauty products in the dataset, thereby enhancing BPR capability. To answer this question, we present a novel variable-attention neural network to understand the combination of multiple features (termed VM-Net) of beauty product images. Considering that there are few publicly released training datasets for BPR, we establish a new dataset with more than one million images classified into more than 20K categories to improve both the generalization and anti-interference abilities of VM-Net and other methods. We verify the performance of VM-Net and its competitors on the benchmark dataset Perfect-500K, where VM-Net shows clear improvements over the competitors in terms of MAP@7. The source code and dataset will be released upon publication.

8.1 IV Dec 14, 2022

SPIRiT-Diffusion: SPIRiT-driven Score-Based Generative Modeling for Vessel Wall imaging

Chentao Cao, Zhuo-Xu Cui, Jing Cheng et al.

Diffusion models are the most advanced methods in image generation and have been successfully applied to MRI reconstruction. However, the existing methods do not consider the characteristics of multi-coil acquisition of MRI data. Therefore, we propose a new diffusion model, called SPIRiT-Diffusion, based on the SPIRiT iterative reconstruction algorithm. Specifically, SPIRiT-Diffusion characterizes the prior distribution of coil-by-coil images by score matching and characterizes the k-space redundant prior between coils based on self-consistency. With sufficient prior constraints utilized, we achieve superior reconstruction results on the joint Intracranial and Carotid Vessel Wall imaging dataset.

2.7 IV Dec 15, 2022

Universal Generative Modeling in Dual-domain for Dynamic MR Imaging

Chuanming Yu, Yu Guan, Ziwen Ke et al.

Dynamic magnetic resonance image reconstruction from incomplete k-space data has generated great research interest due to its capability to reduce scan time. Nevertheless, the reconstruction problem is still challenging due to its ill-posed nature. Recently, diffusion models, especially score-based generative models, have exhibited great potential in algorithm robustness and usage flexibility. Moreover, the unified framework through the variance exploding stochastic differential equation (VE-SDE) is proposed to enable new sampling methods and further extend the capabilities of score-based generative models. Therefore, by taking advantage of the unified framework, we proposed a k-space and image Dual-Domain collaborative Universal Generative Model (DD-UGM) which combines the score-based prior with a low-rank regularization penalty to reconstruct highly under-sampled measurements. More precisely, we extract prior components from both image and k-space domains via a universal generative model and adaptively handle these prior components for faster processing while maintaining good generation quality. Experimental comparisons demonstrated the noise reduction and detail preservation abilities of the proposed method. Moreover, DD-UGM can reconstruct data of different frames after training on only a single-frame image, which reflects the flexibility of the proposed model.

3.9 CV Sep 26, 2023

InvKA: Gait Recognition via Invertible Koopman Autoencoder

Fan Li, Dong Liang, Jing Lian et al.

Most current gait recognition methods suffer from poor interpretability and high computational cost. To improve interpretability, we investigate gait features in the embedding space based on Koopman operator theory. The transition matrix in this space, namely the Koopman operator, captures complex kinematic features of gait cycles. The diagonal elements of the operator matrix can represent the overall motion trend, providing a physically meaningful descriptor. To reduce the computational cost of our algorithm, we use a reversible autoencoder to reduce the model size and eliminate convolutional layers to compress its depth, resulting in fewer floating-point operations. Experimental results on multiple datasets show that our method reduces computational cost to 1% of that of state-of-the-art methods while achieving a competitive recognition accuracy of 98% on non-occlusion datasets.
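The role of the transition matrix can be illustrated with a small sketch. It assumes the embedded gait frames are already available as columns of an array and fits the linear operator by least squares, as in dynamic mode decomposition. That fitting procedure, the function name, and the array layout are assumptions for illustration; InvKA learns its embedding and operator with an invertible autoencoder.

```python
import numpy as np


def fit_koopman_operator(embeddings):
    """Least-squares estimate of K such that z[t + 1] is close to K @ z[t].

    embeddings: array of shape (d, T) whose columns are successive embedded
    frames (hypothetical input; the paper obtains them from an autoencoder).
    """
    past = embeddings[:, :-1]    # z[0] ... z[T - 2]
    future = embeddings[:, 1:]   # z[1] ... z[T - 1]
    # The pseudo-inverse gives the minimizer of ||future - K @ past||_F.
    return future @ np.linalg.pinv(past)
```

With the operator in hand, `np.diag(fit_koopman_operator(z))` gives the diagonal elements that the abstract describes as a descriptor of the overall motion trend.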

5.3 IV Jun 8, 2023

Connectional-Style-Guided Contextual Representation Learning for Brain Disease Diagnosis

Gongshu Wang, Ning Jiang, Yunxiao Ma et al.

Structural magnetic resonance imaging (sMRI) has shown great clinical value and has been widely used in deep learning (DL) based computer-aided brain disease diagnosis. Previous approaches focused on local shapes and textures in sMRI that may be significant only within a particular domain. The learned representations are likely to contain spurious information and have a poor generalization ability in other diseases and datasets. To facilitate capturing meaningful and robust features, it is necessary to first comprehensively understand the intrinsic pattern of the brain that is not restricted within a single data/task domain. Considering that the brain is a complex connectome of interlinked neurons, the connectional properties in the brain have strong biological significance, which is shared across multiple domains and covers most pathological information. In this work, we propose a connectional style contextual representation learning model (CS-CRL) to capture the intrinsic pattern of the brain, used for multiple brain disease diagnosis. Specifically, it has a vision transformer (ViT) encoder and leverages mask reconstruction as the proxy task and Gram matrices to guide the representation of connectional information. It facilitates the capture of global context and the aggregation of features with biological plausibility. The results indicate that CS-CRL achieves superior accuracy in multiple brain disease diagnosis tasks across six datasets and three diseases and outperforms state-of-the-art models. Furthermore, we demonstrate that CS-CRL captures more brain-network-like properties, better aggregates features, is easier to optimize and is more robust to noise, which explains its superiority in theory. Our source code will be released soon.

6.5 CV Aug 11, 2022 Code

K-UNN: k-Space Interpolation With Untrained Neural Network

Zhuo-Xu Cui, Sen Jia, Qingyong Zhu et al.

Recently, untrained neural networks (UNNs) have shown satisfactory performances for MR image reconstruction on random sampling trajectories without using additional full-sampled training data. However, the existing UNN-based approach does not fully use the MR image physical priors, resulting in poor performance in some common scenarios (e.g., partial Fourier, regular sampling, etc.) and the lack of theoretical guarantees for reconstruction accuracy. To bridge this gap, we propose a safeguarded k-space interpolation method for MRI using a specially designed UNN with a tripled architecture driven by three physical priors of the MR images (or k-space data), including sparsity, coil sensitivity smoothness, and phase smoothness. We also prove that the proposed method guarantees tight bounds for interpolated k-space data accuracy. Finally, ablation experiments show that the proposed method can more accurately characterize the physical priors of MR images than existing traditional methods. Additionally, under a series of commonly used sampling trajectories, experiments also show that the proposed method consistently outperforms traditional parallel imaging methods and existing UNNs, and even outperforms the state-of-the-art supervised-trained k-space deep learning methods in some cases.

3.9 CV Sep 17, 2023

Convex Latent-Optimized Adversarial Regularizers for Imaging Inverse Problems

Huayu Wang, Chen Luo, Taofeng Xie et al.

Recently, data-driven techniques have demonstrated remarkable effectiveness in addressing challenges related to MR imaging inverse problems. However, these methods still exhibit certain limitations in terms of interpretability and robustness. In response, we introduce Convex Latent-Optimized Adversarial Regularizers (CLEAR), a novel and interpretable data-driven paradigm. CLEAR represents a fusion of deep learning (DL) and variational regularization. Specifically, we employ a latent optimization technique to adversarially train an input convex neural network, and its set of minima can fully represent the real data manifold. We utilize it as a convex regularizer to formulate a CLEAR-informed variational regularization model that guides the solution of the imaging inverse problem on the real data manifold. Leveraging its inherent convexity, we have established the convergence of the projected subgradient descent algorithm for the CLEAR-informed regularization model. This convergence guarantees the attainment of a unique solution to the imaging inverse problem, subject to certain assumptions. Furthermore, we have demonstrated the robustness of our CLEAR-informed model, explicitly showcasing its capacity to achieve stable reconstruction even in the presence of measurement interference. Finally, we illustrate the superiority of our approach using MRI reconstruction as an example. Our method consistently outperforms conventional data-driven techniques and traditional regularization approaches, excelling in both reconstruction quality and robustness.

6.5 CV Jul 4, 2024 Code

Relative Difficulty Distillation for Semantic Segmentation

Dong Liang, Yue Sun, Yun Du et al.

Current knowledge distillation (KD) methods primarily focus on transferring various structured knowledge and designing corresponding optimization goals to encourage the student network to imitate the output of the teacher network. However, introducing too many additional optimization objectives may lead to unstable training, such as gradient conflicts. Moreover, these methods ignore the guidance offered by the relative learning difficulty between the teacher and student networks. Inspired by human cognitive science, in this paper, we redefine knowledge from a new perspective -- the student and teacher networks' relative difficulty of samples, and propose a pixel-level KD paradigm for semantic segmentation named Relative Difficulty Distillation (RDD). We propose a two-stage RDD framework: Teacher-Full Evaluated RDD (TFE-RDD) and Teacher-Student Evaluated RDD (TSE-RDD). RDD allows the teacher network to provide effective guidance on learning focus without additional optimization goals, thus avoiding adjusting learning weights for multiple losses. Extensive experimental evaluations using a general distillation loss function on popular datasets such as Cityscapes, CamVid, Pascal VOC, and ADE20k demonstrate the effectiveness of RDD against state-of-the-art KD methods. Additionally, our research shows that RDD can integrate with existing KD methods to improve their upper performance bound.

17.2 CV Dec 13, 2021 Code

Semantically Contrastive Learning for Low-light Image Enhancement

Dong Liang, Ling Li, Mingqiang Wei et al.

Low-light image enhancement (LLE) remains challenging due to the prevailing low-contrast and weak-visibility problems of single RGB images. In this paper, we respond to an intriguing learning-related question -- can leveraging both accessible unpaired over/underexposed images and high-level semantic guidance improve the performance of cutting-edge LLE models? Here, we propose an effective semantically contrastive learning paradigm for LLE (namely SCL-LLE). Beyond the existing LLE wisdom, it casts the image enhancement task as multi-task joint learning, where LLE is converted into three constraints of contrastive learning, semantic brightness consistency, and feature preservation for simultaneously ensuring the exposure, texture, and color consistency. SCL-LLE allows the LLE model to learn from unpaired positives (normal-light)/negatives (over/underexposed), and enables it to interact with the scene semantics to regularize the image enhancement network; this interaction of high-level semantic knowledge and the low-level signal prior has seldom been investigated in previous methods. Trained on readily available open data, our method surpasses state-of-the-art LLE models on six independent cross-scene datasets in extensive experiments. Moreover, SCL-LLE's potential to benefit downstream semantic segmentation under extremely dark conditions is discussed. Source Code: https://github.com/LingLIx/SCL-LLE.

7.3 CV Mar 21, 2021 Code

Learning Calibrated-Guidance for Object Detection in Aerial Images

Zongqi Wei, Dong Liang, Dong Zhang et al.

Object detection is one of the most fundamental yet challenging research topics in the domain of computer vision. Recently, the study of this topic in aerial images has made tremendous progress. However, complex backgrounds and poor imaging quality are obvious problems in aerial object detection. Most state-of-the-art approaches tend to develop elaborate attention mechanisms for space-time feature calibration with arduous computational complexity, while surprisingly ignoring the importance of channel-wise feature calibration. In this work, we propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance channel communications in a feature transformer fashion, which can adaptively determine the calibration weights for each channel based on the global feature affinity correlations. Specifically, for a given set of feature maps, CG first computes the feature similarity between each channel and the remaining channels as the intermediary calibration guidance. Then, it re-represents each channel by aggregating all the channels weighted together via the guidance operation. Our CG is a general module that can be plugged into any deep neural network; the resulting network is named CG-Net. To demonstrate its effectiveness and efficiency, extensive experiments are carried out on both oriented and horizontal object detection tasks in aerial images. Experimental results on two challenging benchmarks (DOTA and HRSC2016) demonstrate that our CG-Net can achieve new state-of-the-art accuracy with a fair computational overhead. The source code has been open sourced at https://github.com/WeiZongqi/CG-Net
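The two CG steps, computing channel-to-channel similarity and then re-representing each channel as a weighted aggregate of all channels, can be sketched in a few lines. The dot-product similarity, the scaling by the square root of the spatial size, the softmax normalization, and the inclusion of each channel's similarity with itself are assumptions borrowed from standard attention. They are not taken from the CG-Net code, and the function name is hypothetical.

```python
import numpy as np


def calibrate_channels(features):
    """Re-represent each channel as a similarity-weighted sum of all channels.

    features: array of shape (C, H, W).
    Returns the calibrated features and the (C, C) guidance weights.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    # Step 1: pairwise channel similarity (scaled dot product, an assumption).
    affinity = flat @ flat.T / np.sqrt(h * w)
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    affinity -= affinity.max(axis=1, keepdims=True)
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)
    # Step 2: aggregate all channels using the guidance weights.
    calibrated = (weights @ flat).reshape(c, h, w)
    return calibrated, weights
```

The output has the same shape as the input, which is what would let such a module be dropped between existing layers of a detector.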

1.2 MED-PH Aug 27, 2024

Sequential-Scanning Dual-Energy CT Imaging Using High Temporal Resolution Image Reconstruction and Error-Compensated Material Basis Image Generation

Qiaoxin Li, Ruifeng Chen, Peng Wang et al.

Dual-energy computed tomography (DECT) has been widely used to obtain quantitative elemental composition of imaged subjects for personalized and precise medical diagnosis. Compared with DECT leveraging advanced X-ray source and/or detector technologies, the use of the sequential-scanning data acquisition scheme to implement DECT may make a broader impact on clinical practice because this scheme requires no specialized hardware designs and can be directly implemented into conventional CT systems. However, since the concentration of iodinated contrast agent in the imaged subject varies over time, sequentially scanned data sets acquired at two tube potentials are temporally inconsistent. As existing material basis image reconstruction approaches assume that the data sets acquired at two tube potentials are temporally consistent, the violation of this assumption results in inaccurate quantification of material concentration. In this work, we developed sequential-scanning DECT imaging using high temporal resolution image reconstruction and error-compensated material basis image generation, ACCELERATION in short, to address the technical challenge induced by temporal inconsistency of sequentially scanned data sets and improve quantification accuracy of material concentration in sequential-scanning DECT. ACCELERATION has been validated and evaluated using numerical simulation data sets generated from clinical human subject exams and experimental human subject studies. Results demonstrated the improvement of quantification accuracy and image quality using ACCELERATION.

20.9 CV Mar 22, 2024 Code

Transfer CLIP for Generalizable Image Denoising

Jun Cheng, Dong Liang, Shan Tan

Image denoising is a fundamental task in computer vision. While prevailing deep learning-based supervised and self-supervised methods have excelled in eliminating in-distribution noise, their susceptibility to out-of-distribution (OOD) noise remains a significant challenge. The recent emergence of contrastive language-image pre-training (CLIP) model has showcased exceptional capabilities in open-world image recognition and segmentation. Yet, the potential for leveraging CLIP to enhance the robustness of low-level tasks remains largely unexplored. This paper uncovers that certain dense features extracted from the frozen ResNet image encoder of CLIP exhibit distortion-invariant and content-related properties, which are highly desirable for generalizable denoising. Leveraging these properties, we devise an asymmetrical encoder-decoder denoising network, which incorporates dense features including the noisy image and its multi-scale features from the frozen ResNet encoder of CLIP into a learnable image decoder to achieve generalizable denoising. The progressive feature augmentation strategy is further proposed to mitigate feature overfitting and improve the robustness of the learnable decoder. Extensive experiments and comparisons conducted across diverse OOD noises, including synthetic noise, real-world sRGB noise, and low-dose CT image noise, demonstrate the superior generalization ability of our method.

6.3 IV Nov 21, 2024

Guided MRI Reconstruction via Schrödinger Bridge

Yue Wang, Yuanbiao Yang, Zhuo-xu Cui et al.

Magnetic Resonance Imaging (MRI) is an inherently multi-contrast modality, where cross-contrast priors can be exploited to improve image reconstruction from undersampled data. Recently, diffusion models have shown remarkable performance in MRI reconstruction. However, they still struggle to effectively utilize such priors, mainly because existing methods rely on feature-level fusion in image or latent spaces, which lacks explicit structural correspondence and thus leads to suboptimal performance. To address this issue, we propose $\mathbf{I}^2$SB-Inversion, a multi-contrast guided reconstruction framework based on the Schrödinger Bridge (SB). The proposed method performs pixel-wise translation between paired contrasts, providing explicit structural constraints between the guidance and target images. Furthermore, an Inversion strategy is introduced to correct inter-modality misalignment, which often occurs in guided reconstruction, thereby mitigating artifacts and improving reconstruction accuracy. Experiments on paired T1- and T2-weighted datasets demonstrate that $\mathbf{I}^2$SB-Inversion achieves a high acceleration factor of up to 14.4 and consistently outperforms existing methods in both quantitative and qualitative evaluations.

7.1LGOct 22, 2025

ELUTQ: Efficient LUT-Aware Quantization for Deploying Large Language Models on Edge Devices

Xin Nie, Liang Dong, HaiCheng Zhang et al.

The deployment of Large Language Models (LLMs) on CPU-based edge devices is crucial for enabling on-device intelligence and expanding AI accessibility. However, it remains challenging due to limited memory and computational resources. During edge inference, memory usage and latency are the primary bottlenecks. Although weight quantization can effectively reduce memory consumption, existing hardware-friendly approaches often rely on uniform quantization, which poorly fits weight distributions and incurs high dequantization overhead at low bit widths. To address these limitations, we propose ELUTQ, an efficient quantization framework introducing a novel quantization format, Hierarchical Linear Quantization (HLQ). HLQ better captures the statistical characteristics of weights without increasing the computational cost of Bit-serial LUT-based GEMM operations, thereby eliminating dequantization overhead. It is orthogonal to existing quantization algorithms and can be seamlessly integrated into various quantization pipelines. For efficient on-device deployment, ELUTQ provides optimized CPU kernels for end-to-end inference. Experiments show that for LLaMA3-8B, HLQ reduces perplexity by about 8% at 3-bit and 85% at 2-bit precision under post-training quantization, completing quantization within one hour. With efficient finetuning, HLQ further improves 2-bit performance within two hours. In terms of inference efficiency, our 2-bit LLaMA2-7B achieves over 25 tokens/s on an Apple M2 chip (4 threads, batch size = 1).

3.6CVSep 4, 2025

K-Syn: K-space Data Synthesis in Ultra Low-data Regimes

Guan Yu, Zhang Jianhua, Liang Dong et al.

Owing to the inherently dynamic and complex characteristics of cardiac magnetic resonance (CMR) imaging, high-quality and diverse k-space data are rarely available in practice, which in turn hampers robust reconstruction of dynamic cardiac MRI. To address this challenge, we perform feature-level learning directly in the frequency domain and employ a temporal-fusion strategy as the generative guidance to synthesize k-space data. Specifically, leveraging the global representation capacity of the Fourier transform, the frequency domain can be considered a natural global feature space. Therefore, unlike traditional methods that use pixel-level convolution for feature learning and modeling in the image domain, this letter focuses on feature-level modeling in the frequency domain, enabling stable and rich generation even with ultra low-data regimes. Moreover, leveraging the advantages of feature-level modeling in the frequency domain, we integrate k-space data across time frames with multiple fusion strategies to steer and further optimize the generative trajectory. Experimental results demonstrate that the proposed method possesses strong generative ability in low-data regimes, indicating practical potential to alleviate data scarcity in dynamic MRI reconstruction.

3.6CVJul 24, 2025

Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models

Xingyu Qiu, Mengying Yang, Xinghua Ma et al.

EDM elucidates the unified design space of diffusion models, yet its fixed noise pattern, restricted to pure Gaussian noise, limits advancements in image restoration. Our study indicates that forcibly injecting Gaussian noise corrupts the degraded images, overextends the image transformation distance, and increases restoration complexity. To address this problem, our proposed EDA Elucidates the Design space of Arbitrary-noise-based diffusion models. Theoretically, EDA expands the freedom of the noise pattern while preserving the original module flexibility of EDM, with rigorous proof that increased noise complexity incurs no additional computational overhead during restoration. EDA is validated on three typical tasks: MRI bias field correction (global smooth noise), CT metal artifact reduction (global sharp noise), and natural image shadow removal (local boundary-aware noise). With only 5 sampling steps, EDA outperforms most task-specific methods and achieves state-of-the-art performance in bias field correction and shadow removal.

6.2CVJul 22, 2025

HOComp: Interaction-Aware Human-Object Composition

Dong Liang, Jinyuan Jia, Yuhao Liu et al.

While existing image-guided composition methods may help insert a foreground object onto a user-specified region of a background image, achieving natural blending inside the region with the rest of the image unchanged, we observe that these existing methods often struggle in synthesizing seamless interaction-aware compositions when the task involves human-object interactions. In this paper, we first propose HOComp, a novel approach for compositing a foreground object onto a human-centric background image, while ensuring harmonious interactions between the foreground object and the background person and their consistent appearances. Our approach includes two key designs: (1) MLLMs-driven Region-based Pose Guidance (MRPG), which utilizes MLLMs to identify the interaction region as well as the interaction type (e.g., holding and lifting) to provide coarse-to-fine constraints to the generated pose for the interaction while incorporating human pose landmarks to track action variations and enforcing fine-grained pose constraints; and (2) Detail-Consistent Appearance Preservation (DCAP), which unifies a shape-aware attention modulation mechanism, a multi-view appearance loss, and a background consistency loss to ensure consistent shapes/textures of the foreground and faithful reproduction of the background human. We then propose the first dataset, named Interaction-aware Human-Object Composition (IHOC), for the task. Experimental results on our dataset show that HOComp effectively generates harmonious human-object interactions with consistent appearances, and outperforms relevant methods qualitatively and quantitatively.

4.1LGMar 6, 2025

The day-ahead scenario generation method for new energy based on an improved conditional generative diffusion model

Changgang Wang, Wei Liu, Yu Cao et al.

In the context of the rising share of new energy generation, accurately generating new energy output scenarios is crucial for day-ahead power system scheduling. Deep learning-based scenario generation methods can address this need, but their black-box nature raises concerns about interpretability. To tackle this issue, this paper introduces a method for day-ahead new energy scenario generation based on an improved conditional generative diffusion model. This method is built on the theoretical framework of Markov chains and variational inference. It first transforms historical data into pure noise through a diffusion process, then uses conditional information to guide the denoising process, ultimately generating scenarios that satisfy the conditional distribution. Additionally, the noise schedule is changed to a cosine form, enhancing the quality of the generated scenarios. When applied to actual wind and solar output data, the results demonstrate that this method effectively generates new energy output scenarios with good adaptability.
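
The cosine-form noise schedule mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the widely used cosine schedule for diffusion models, in which the cumulative signal level follows a squared cosine; the paper's exact form may differ. The function names, the offset `s`, and the clipping value `max_beta` are illustrative choices, not taken from the paper.

```python
import math
from typing import List


def cosine_alpha_bar(t: int, T: int, s: float = 0.008) -> float:
    """Cumulative signal level alpha_bar(t) under a cosine schedule.

    The small offset s (an assumed default) keeps the first noise
    increments from being vanishingly small.
    """
    def f(u: float) -> float:
        return math.cos((u / T + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def cosine_betas(T: int, s: float = 0.008, max_beta: float = 0.999) -> List[float]:
    """Per-step noise variances beta_1..beta_T derived from alpha_bar."""
    betas = []
    for t in range(1, T + 1):
        beta = 1.0 - cosine_alpha_bar(t, T, s) / cosine_alpha_bar(t - 1, T, s)
        # Clip to avoid a degenerate final step where beta approaches 1.
        betas.append(min(beta, max_beta))
    return betas
```

Calling `cosine_betas(1000)` would give one variance per diffusion step; compared with a linear schedule, the signal level decays more gradually at the start and end of the forward process.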

8.4CVJan 10, 2025Code

StructSR: Refuse Spurious Details in Real-World Image Super-Resolution

Yachao Li, Dong Liang, Tianyu Ding et al.

Diffusion-based models have shown great promise in real-world image super-resolution (Real-ISR), but often generate content with structural errors and spurious texture details due to the empirical priors and illusions of these models. To address this issue, we introduce StructSR, a simple, effective, and plug-and-play method that enhances structural fidelity and suppresses spurious details for diffusion-based Real-ISR. StructSR operates without the need for additional fine-tuning, external model priors, or high-level semantic knowledge. At its core is the Structure-Aware Screening (SAS) mechanism, which identifies the image with the highest structural similarity to the low-resolution (LR) input in the early inference stage, allowing us to leverage it as historical structure knowledge to suppress the generation of spurious details. By intervening in the diffusion inference process, StructSR seamlessly integrates with existing diffusion-based Real-ISR models. Our experimental results demonstrate that StructSR significantly improves the fidelity of structure and texture, improving the PSNR and SSIM metrics by an average of 5.27% and 9.36% on a synthetic dataset (DIV2K-Val) and 4.13% and 8.64% on two real-world datasets (RealSR and DRealSR) when integrated with four state-of-the-art diffusion-based Real-ISR methods.

3.6IVDec 6, 2024

Reconstructing Quantitative Cerebral Perfusion Images Directly From Measured Sinogram Data Acquired Using C-arm Cone-Beam CT

Haotian Zhao, Ruifeng Chen, Jing Yan et al.

To shorten the door-to-puncture time for better treating patients with acute ischemic stroke, it is highly desired to obtain quantitative cerebral perfusion images using C-arm cone-beam computed tomography (CBCT) equipped in the interventional suite. However, limited by the slow gantry rotation speed, the temporal resolution and temporal sampling density of typical C-arm CBCT are much poorer than those of multi-detector-row CT in the diagnostic imaging suite. The current quantitative perfusion imaging includes two cascaded steps: time-resolved image reconstruction and perfusion parametric estimation. For time-resolved image reconstruction, the technical challenge imposed by poor temporal resolution and poor sampling density causes inaccurate quantification of the temporal variation of cerebral artery and tissue attenuation values. For perfusion parametric estimation, it remains a technical challenge to appropriately design the handcrafted regularization for better solving the associated deconvolution problem. These two challenges together prevent obtaining quantitatively accurate perfusion images using C-arm CBCT. The purpose of this work is to simultaneously address these two challenges by combining the two cascaded steps into a single joint optimization problem and reconstructing quantitative perfusion images directly from the measured sinogram data. In the developed direct cerebral perfusion parametric image reconstruction technique, TRAINER in short, the quantitative perfusion images have been represented as a subject-specific conditional generative model trained under the constraint of the time-resolved CT forward model, perfusion convolutional model, and the subject's own measured sinogram data. Results shown in this paper demonstrated that using TRAINER, quantitative cerebral perfusion images can be accurately obtained using C-arm CBCT in the interventional suite.

19.2IVSep 2, 2023Code

Yu Guan, Chuanming Yu, Shiyu Lu et al.

Most existing MRI reconstruction methods perform targeted reconstruction of the entire MR image without taking specific tissue regions into consideration. This may fail to emphasize the reconstruction accuracy on important tissues for diagnosis. In this study, leveraging a combination of the properties of k-space data and the diffusion process, our novel scheme focuses on mining the multi-frequency prior with different strategies to preserve fine texture details in the reconstructed image. In addition, a diffusion process can converge more quickly if its target distribution closely resembles the noise distribution in the process. This can be accomplished through various high-frequency prior extractors. The finding further solidifies the effectiveness of the score-based generative model. On top of all the advantages, our method improves the accuracy of MRI reconstruction and accelerates sampling process. Experimental results verify that the proposed method successfully obtains more accurate reconstruction and outperforms state-of-the-art methods.

2.4OCMay 25, 2023

Nonlinear Bipartite Output Regulation with Application to Turing Pattern

Dong Liang, Martin Guay, Shimin Wang

In this paper, a bipartite output regulation problem is solved for a class of nonlinear multi-agent systems subject to static signed communication networks. A nonlinear distributed observer is proposed for a nonlinear exosystem with cooperation-competition interactions to address the problem. Sufficient conditions are provided to guarantee its existence and stability. The exponential stability of the observer is established. As a practical application, a leader-following bipartite consensus problem is solved for a class of nonlinear multi-agent systems based on the observer. Finally, a network of multiple pendulum systems is treated to support the feasibility of the proposed design. The possible application of the approach to generate specific Turing patterns is also presented.

5.5LGDec 18, 2021

Equilibrated Zeroth-Order Unrolled Deep Networks for Accelerated MRI

Zhuo-Xu Cui, Jing Cheng, Qingyong Zhu et al.

Recently, model-driven deep learning unrolls a certain iterative algorithm of a regularization model into a cascade network by replacing the first-order information (i.e., (sub)gradient or proximal operator) of the regularizer with a network module, which appears more explainable and predictable compared to common data-driven networks. However, in theory, there is not necessarily a functional regularizer whose first-order information matches the replaced network module, which means the network output may not be covered by the original regularization model. Moreover, up to now, there is also no theory to guarantee the global convergence and robustness (regularity) of unrolled networks under realistic assumptions. To bridge this gap, this paper proposes a safeguarded methodology for network unrolling. Specifically, focusing on accelerated MRI, we unroll a zeroth-order algorithm, in which the network module represents the regularizer itself, so that the network output is still covered by the regularization model. Furthermore, inspired by the idea of deep equilibrium models, before backpropagating, we run the unrolled iterative network until it converges to a fixed point, which ensures convergence. In case the measurement data contains noise, we prove that the proposed network is robust against noise interference. Finally, numerical experiments show that the proposed network consistently outperforms state-of-the-art MRI reconstruction methods, including traditional regularization methods and other deep learning methods.

6.1IVDec 1, 2021

Total-Body Low-Dose CT Image Denoising using Prior Knowledge Transfer Technique with Contrastive Regularization Mechanism

Minghan Fu, Yanhua Duan, Zhaoping Cheng et al.

Reducing the radiation exposure for patients in Total-body CT scans has attracted extensive attention in the medical imaging community. However, a low radiation dose may result in increased noise and artifacts, which greatly affect clinical diagnosis. To obtain high-quality Total-body Low-dose CT (LDCT) images, previous deep-learning-based research work has introduced various network architectures. However, most of these methods only adopt Normal-dose CT (NDCT) images as ground truths to guide the training of the denoising network. Such a simple restriction makes the model less effective and causes the reconstructed images to suffer from over-smoothing effects. In this paper, we propose a novel intra-task knowledge transfer method that leverages the distilled knowledge from NDCT images to assist the training process on LDCT images. The derived architecture is referred to as the Teacher-Student Consistency Network (TSC-Net), which consists of the teacher network and the student network with identical architecture. Through the supervision between intermediate features, the student network is encouraged to imitate the teacher network and gain abundant texture details. Moreover, to further exploit the information contained in CT scans, a contrastive regularization mechanism (CRM) built upon contrastive learning is introduced. CRM pulls the restored CT images closer to the NDCT samples and pushes them away from the LDCT samples in the latent space. In addition, based on the attention and deformable convolution mechanism, we design a Dynamic Enhancement Module (DEM) to improve the network transformation capability.

15.8IVSep 7, 2021Code

MRI Reconstruction Using Deep Energy-Based Model

Yu Guan, Zongjiang Tu, Shanshan Wang et al.

Purpose: Although recent deep energy-based generative models (EBMs) have shown encouraging results in many image generation tasks, how to take advantage of the self-adversarial cogitation in deep EBMs to boost the performance of Magnetic Resonance Imaging (MRI) reconstruction is still an open question. Methods: With the successful application of deep learning in a wide range of MRI reconstruction, a line of emerging research involves formulating an optimization-based reconstruction method in the space of a generative model. Leveraging this, a novel regularization strategy is introduced in this article which takes advantage of self-adversarial cogitation of the deep energy-based model. More precisely, we advocate alternately learning a more powerful energy-based model with maximum likelihood estimation to obtain the deep energy-based information, represented as an image prior. Simultaneously, implicit inference with Langevin dynamics is a unique property of reconstruction. In contrast to other generative models for reconstruction, the proposed method utilizes deep energy-based information as the image prior in reconstruction to improve the quality of the image. Results: Experimental results imply that the proposed technique can obtain remarkable performance in terms of high reconstruction accuracy that is competitive with state-of-the-art methods, and does not suffer from mode collapse. Conclusion: Algorithmically, an iterative approach was presented to strengthen EBM training with the gradient of the energy network. The robustness and the reproducibility of the algorithm were also experimentally validated. More importantly, the proposed reconstruction framework can be generalized for most MRI reconstruction scenarios.

2.6CVApr 13, 2021

SRR-Net: A Super-Resolution-Involved Reconstruction Method for High Resolution MR Imaging

Wenqi Huang, Sen Jia, Ziwen Ke et al.

Improving the image resolution and acquisition speed of magnetic resonance imaging (MRI) is a challenging problem. There are mainly two strategies for dealing with the speed-resolution trade-off: (1) $k$-space undersampling with high-resolution acquisition, and (2) a pipeline of lower-resolution image reconstruction and image super-resolution. However, these approaches either have limited performance at high acceleration factors or suffer from the error accumulation of the two-step structure. In this paper, we combine the ideas of MR reconstruction and image super-resolution, and work on recovering HR images from low-resolution under-sampled $k$-space data directly. In particular, the SR-involved reconstruction can be formulated as a variational problem, and a learnable network unrolled from its solution algorithm is proposed. A discriminator is introduced to enhance the detail-refining performance. Experimental results using in-vivo HR multi-coil brain data indicate that the proposed SRR-Net is capable of recovering high-resolution brain images with both good visual quality and perceptual quality.

12.0IVMar 9, 2021Code

Deep Manifold Learning for Dynamic MR Imaging

Ziwen Ke, Zhuo-Xu Cui, Wenqi Huang et al.

Purpose: To develop a deep learning method on a nonlinear manifold to explore the temporal redundancy of dynamic signals to reconstruct cardiac MRI data from highly undersampled measurements. Methods: Cardiac MR image reconstruction is modeled as general compressed sensing (CS) based optimization on a low-rank tensor manifold. The nonlinear manifold is designed to characterize the temporal correlation of dynamic signals. Iterative procedures can be obtained by solving the optimization model on the manifold, including gradient calculation, projection of the gradient to tangent space, and retraction of the tangent space to the manifold. The iterative procedures on the manifold are unrolled to a neural network, dubbed as Manifold-Net. The Manifold-Net is trained using in vivo data with a retrospective electrocardiogram (ECG)-gated segmented bSSFP sequence. Results: Experimental results at high accelerations demonstrate that the proposed method can obtain improved reconstruction compared with a compressed sensing (CS) method k-t SLR and two state-of-the-art deep learning-based methods, DC-CNN and CRNN. Conclusion: This work represents the first study unrolling the optimization on manifolds into neural networks. Specifically, the designed low-rank manifold provides a new technical route for applying low-rank priors in dynamic MR imaging.

19.8IVOct 26, 2020Code

Deep Low-rank plus Sparse Network for Dynamic MR Imaging

Wenqi Huang, Ziwen Ke, Zhuo-Xu Cui et al.

In dynamic magnetic resonance (MR) imaging, low-rank plus sparse (L+S) decomposition, or robust principal component analysis (PCA), has achieved stunning performance. However, the selection of the parameters of L+S is empirical, and the acceleration rate is limited, which are common failings of iterative compressed sensing MR imaging (CS-MRI) reconstruction methods. Many deep learning approaches have been proposed to address these issues, but few of them use a low-rank prior. In this paper, a model-based low-rank plus sparse network, dubbed L+S-Net, is proposed for dynamic MR reconstruction. In particular, we use an alternating linearized minimization method to solve the optimization problem with low-rank and sparse regularization. Learned soft singular value thresholding is introduced to ensure the clear separation of the L component and S component. Then, the iterative steps are unrolled into a network in which the regularization parameters are learnable. We prove that the proposed L+S-Net achieves global convergence under two standard assumptions. Experiments on retrospective and prospective cardiac cine datasets show that the proposed model outperforms state-of-the-art CS and existing deep learning methods and has great potential for extremely high acceleration factors (up to 24x).

1.2CVSep 9, 2020

Is Each Layer Non-trivial in CNN?

Wei Wang, Yanjie Zhu, Zhuoxu Cui et al.

Convolutional neural network (CNN) models have achieved great success in many fields. With the advent of ResNet, networks used in practice are getting deeper and wider. However, is each layer non-trivial in networks? To answer this question, we train a network on the training set, then replace the network's convolution kernels with zeros and test the resulting models on the test set. We compare the experimental results with the baseline and show that we can reach similar or even the same performance. Although convolution kernels are the cores of networks, we demonstrate that some of them are trivial and regular in ResNet.

11.4IVAug 14, 2020Code

Homotopic Gradients of Generative Density Priors for MR Image Reconstruction

Cong Quan, Jinjie Zhou, Yuanzheng Zhu et al.

Deep learning, particularly the generative model, has recently demonstrated tremendous potential to significantly speed up image reconstruction with reduced measurements. Rather than the existing generative models that often optimize the density priors, in this work, by taking advantage of the denoising score matching, homotopic gradients of generative density priors (HGGDP) are proposed for magnetic resonance imaging (MRI) reconstruction. More precisely, to tackle the low-dimensional manifold and low data density region issues in generative density prior, we estimate the target gradients in higher-dimensional space. We train a more powerful noise conditional score network by forming a high-dimensional tensor as the network input at the training phase. More artificial noise is also injected in the embedding space. At the reconstruction stage, a homotopy method is employed to pursue the density prior, so as to boost the reconstruction performance. Experimental results indicate the remarkable performance of HGGDP in terms of high reconstruction accuracy; only 10% of the k-space data can still generate images of high quality as effectively as standard MRI reconstruction with the fully sampled data.

6.5IVJun 22, 2020

Deep Low-rank Prior in Dynamic MR Imaging

Ziwen Ke, Wenqi Huang, Jing Cheng et al.

Deep learning methods have achieved attractive performance in dynamic MR cine imaging. However, all of these methods are only driven by the sparse prior of MR images, while the important low-rank (LR) prior of dynamic MR cine images is not explored, which limits further improvements in dynamic MR reconstruction. In this paper, a learned singular value thresholding (Learned-SVT) operation is proposed to explore deep low-rank prior in dynamic MR imaging for obtaining improved reconstruction results. In particular, we come up with two novel and distinct schemes to introduce the learnable low-rank prior into deep network architectures in an unrolling manner and a plug-and-play manner respectively. In the unrolling manner, we put forward a model-based unrolling sparse and low-rank network for dynamic MR imaging, dubbed SLR-Net. The SLR-Net is defined over a deep network flow graph, which is unrolled from the iterative procedures in the Iterative Shrinkage-Thresholding Algorithm (ISTA) for optimizing a sparse and low-rank based dynamic MRI model. In the plug-and-play manner, we present a plug-and-play LR network module that can be easily embedded into any other dynamic MR neural networks without changing the network paradigm. Experimental results show that both schemes can further improve the state-of-the-art CS methods, such as k-t SLR, and sparsity-driven deep learning-based methods, such as DC-CNN and CRNN, both qualitatively and quantitatively.
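
As a point of reference for the singular value thresholding operation named in the abstract above, here is a minimal NumPy sketch of the classical (non-learned) soft-thresholding version. It assumes a fixed threshold `tau`, whereas the abstract describes a learned variant; the function name is illustrative and this is not the authors' implementation.

```python
import numpy as np


def singular_value_threshold(X: np.ndarray, tau: float) -> np.ndarray:
    """Soft-threshold the singular values of X by tau.

    This is the proximal operator of tau * (nuclear norm), the classical
    way to impose a low-rank prior. Here tau is a fixed scalar (an
    assumption); a learned variant would make it a trainable parameter.
    """
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink, and zero out small values
    return (U * s_shrunk) @ Vh           # same as U @ diag(s_shrunk) @ Vh
```

For a dynamic series, one common convention (assumed here) is to reshape the frames into a Casorati matrix of shape `(pixels, frames)` before applying the operator, so that temporal redundancy shows up as low rank.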

21.9IVAug 7, 2019

Model Learning: Primal Dual Networks for Fast MR imaging

Jing Cheng, Haifeng Wang, Leslie Ying et al.

Magnetic resonance imaging (MRI) is known to be a slow imaging modality and undersampling in k-space has been used to increase the imaging speed. However, image reconstruction from undersampled k-space data is an ill-posed inverse problem. Iterative algorithms based on compressed sensing have been used to address the issue. In this work, we unroll the iterations of the primal-dual hybrid gradient algorithm to a learnable deep network architecture, and gradually relax the constraints to reconstruct MR images from highly undersampled k-space data. The proposed method combines the theoretical convergence guarantee of optimization methods with the powerful learning capability of deep networks. As the constraints are gradually relaxed, the reconstruction model is finally learned from the training data by updating in k-space and image domain alternately. Experiments on in vivo MR data demonstrate that the proposed method achieves superior MR reconstructions from highly undersampled k-space data over other state-of-the-art image reconstruction methods.

21.6IVJul 26, 2019

Deep MRI Reconstruction: Unrolled Optimization Algorithms Meet Neural Networks

Dong Liang, Jing Cheng, Ziwen Ke et al.

Image reconstruction from undersampled k-space data has been playing an important role for fast MRI. Recently, deep learning has demonstrated tremendous success in various fields and also shown potential to significantly speed up MR reconstruction with reduced measurements. This article gives an overview of deep learning-based image reconstruction methods for MRI. Three types of deep learning-based approaches are reviewed: the data-driven, model-driven and integrated approaches. The main structure of each network in the three approaches is explained, and the common parts of the reviewed networks and the differences between them are highlighted. Based on the review, a number of signal processing issues are discussed for maximizing the potential of deep reconstruction for fast MRI. The discussion may facilitate further development of "optimal" network and performance analysis from a theoretical point of view.

6.5CVJun 19, 2019

Model-based Deep Medical Imaging: the roadmap of generalizing iterative reconstruction model using deep learning

Jing Cheng, Haifeng Wang, Yanjie Zhu et al.

Medical imaging is playing an increasingly important role in clinics. However, there are several issues in different imaging modalities, such as slow imaging speed in MRI and radiation injury in CT and PET. Therefore, accelerating MRI and reducing radiation dose in CT and PET have been ongoing research topics since their invention. Usually, acquiring less data is a direct but important strategy to address these issues. However, less acquisition usually results in aliasing artifacts in reconstructions. Recently, deep learning (DL) has been introduced in medical image reconstruction and shown potential in significantly speeding up MR reconstruction and reducing radiation dose. In this paper, we propose a general framework for combining the reconstruction model with deep learning to maximize the potential of deep learning and model-based reconstruction, and give examples to demonstrate the performance and requirements of unrolling different algorithms using deep learning.

27.8IVJun 11, 2019Code

DeepcomplexMRI: Exploiting deep residual network for fast parallel MR imaging with complex convolution

Shanshan Wang, Huitao Cheng, Leslie Ying et al.

This paper proposes a multi-channel image reconstruction method, named DeepcomplexMRI, to accelerate parallel MR imaging with a residual complex convolutional neural network. Different from most existing works, which rely on the utilization of the coil sensitivities or prior information of predefined transforms, DeepcomplexMRI takes advantage of the availability of a large number of existing multi-channel ground-truth images and uses them as labeled data to train the deep residual convolutional neural network offline. In particular, a complex convolutional network is proposed to take into account the correlation between the real and imaginary parts of MR images. In addition, the k-space data consistency is further enforced repeatedly in between layers of the network. The evaluations on in vivo datasets show that the proposed method has the capability to recover the desired multi-channel images. Its comparison with state-of-the-art methods also demonstrates that the proposed method can reconstruct the desired MR images more accurately.
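
The k-space data consistency step mentioned in the abstract above can be sketched as follows. This is a simplified single-channel, noise-free version that overwrites the predicted k-space with the measured samples wherever data were acquired; the abstract describes a multi-channel setting, and the exact rule used there is not given, so the function name, arguments, and hard-replacement rule are assumptions.

```python
import numpy as np


def data_consistency(image: np.ndarray,
                     measured_kspace: np.ndarray,
                     mask: np.ndarray) -> np.ndarray:
    """Enforce agreement with acquired k-space samples.

    image           : current complex image estimate, shape (ny, nx)
    measured_kspace : acquired k-space, zeros where nothing was sampled
    mask            : boolean array, True at sampled k-space locations
    """
    k_pred = np.fft.fft2(image, norm="ortho")
    # Keep the network's prediction only where no measurement exists.
    k_dc = np.where(mask, measured_kspace, k_pred)
    return np.fft.ifft2(k_dc, norm="ortho")
```

In an unrolled or cascaded network, a step like this would typically be applied after each learned block so that intermediate images never contradict the acquired samples.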

0.9CVJan 18, 2019

CRDN: Cascaded Residual Dense Networks for Dynamic MR Imaging with Edge-enhanced Loss Constraint

Ziwen Ke, Shanshan Wang, Huitao Cheng et al.

Dynamic magnetic resonance (MR) imaging has generated great research interest, as it can provide both spatial and temporal information for clinical diagnosis. However, slow imaging speed or long scanning time is still one of the challenges for dynamic MR imaging. Most existing methods reconstruct dynamic MR images from incomplete k-space data under the guidance of compressed sensing (CS) or low-rank theory, which suffer from long iterative reconstruction time. Recently, deep learning has shown great potential in accelerating dynamic MR. Our previous work proposed a dynamic MR imaging method with both k-space and spatial prior knowledge integrated via multi-supervised network training. Nevertheless, there was still a certain degree of smoothing in the reconstructed images at high acceleration factors. In this work, we propose cascaded residual dense networks for dynamic MR imaging with an edge-enhanced loss constraint, dubbed CRDN. Specifically, the cascaded residual dense networks fully exploit the hierarchical features from all the convolutional layers with both local and global feature fusion. We further utilize the total variation (TV) loss function, which has edge enhancement properties, for training the networks.
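
For readers unfamiliar with the total variation (TV) term mentioned in the abstract above, a minimal NumPy sketch of the anisotropic form follows. The abstract does not specify which TV variant or weighting is used, so the anisotropic choice and the function name are assumptions.

```python
import numpy as np


def total_variation(image: np.ndarray) -> float:
    """Anisotropic total variation of an image or a stack of images.

    Sums the absolute finite differences along the last two axes
    (rows and columns). Works for real or complex arrays.
    """
    dy = np.abs(np.diff(image, axis=-2))  # vertical neighbour differences
    dx = np.abs(np.diff(image, axis=-1))  # horizontal neighbour differences
    return float(dy.sum() + dx.sum())
```

A piecewise-constant image with sharp edges has a low TV value, while noise or ringing raises it, which is why a TV term is often added to a reconstruction loss.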

15.6CVSep 30, 2018

DIMENSION: Dynamic MR Imaging with Both K-space and Spatial Prior Knowledge Obtained via Multi-Supervised Network Training

Shanshan Wang, Ziwen Ke, Huitao Cheng et al.

Dynamic MR image reconstruction from incomplete k-space data has generated great research interest due to its capability in reducing scan time. Nevertheless, the reconstruction problem is still challenging due to its ill-posed nature. Most existing methods either suffer from long iterative reconstruction time or explore limited prior knowledge. This paper proposes a dynamic MR imaging method with both k-space and spatial prior knowledge integrated via multi-supervised network training, dubbed DIMENSION. Specifically, the DIMENSION architecture consists of a frequential prior network for updating the k-space with its network prediction and a spatial prior network for capturing image structures and details. Furthermore, a multi-supervised network training technique is developed to constrain the frequency domain information and reconstruction results at different levels. The comparisons with classical k-t FOCUSS, k-t SLR, L+S and the state-of-the-art CNN-based method on in vivo datasets show that our method can achieve improved reconstruction results in shorter time.