SPMay 23, 2022
BolT: Fused Window Transformers for fMRI Time Series AnalysisHasan Atakan Bedel, Irmak Şıvgın, Onat Dalmaz et al.
Deep-learning models have enabled performance leaps in analysis of high-dimensional functional MRI (fMRI) data. Yet, many previous methods are suboptimally sensitive for contextual representations across diverse time scales. Here, we present BolT, a blood-oxygen-level-dependent transformer model, for analyzing multi-variate fMRI time series. BolT leverages a cascade of transformer encoders equipped with a novel fused window attention mechanism. Encoding is performed on temporally-overlapped windows within the time series to capture local representations. To integrate information temporally, cross-window attention is computed between base tokens in each window and fringe tokens from neighboring windows. To gradually transition from local to global representations, the extent of window overlap and thereby number of fringe tokens are progressively increased across the cascade. Finally, a novel cross-window regularization is employed to align high-level classification features across the time series. Comprehensive experiments on large-scale public datasets demonstrate the superior performance of BolT against state-of-the-art methods. Furthermore, explanatory analyses to identify landmark time points and regions that contribute most significantly to model decisions corroborate prominent neuroscientific findings in the literature.
IVJul 13, 2022
One Model to Unite Them All: Personalized Federated Learning of Multi-Contrast MRI SynthesisOnat Dalmaz, Usama Mirza, Gökberk Elmas et al.
Multi-institutional collaborations are key for learning generalizable MRI synthesis models that translate source- onto target-contrast images. To facilitate collaboration, federated learning (FL) adopts decentralized training and mitigates privacy concerns by avoiding sharing of imaging data. However, FL-trained synthesis models can be impaired by the inherent heterogeneity in the data distribution, with domain shifts evident when common or variable translation tasks are prescribed across sites. Here we introduce the first personalized FL method for MRI Synthesis (pFLSynth) to improve reliability against domain shifts. pFLSynth is based on an adversarial model that produces latents specific to individual sites and source-target contrasts, and leverages novel personalization blocks to adaptively tune the statistics and weighting of feature maps across the generator stages given latents. To further promote site specificity, partial model aggregation is employed over downstream layers of the generator while upstream layers are retained locally. As such, pFLSynth enables training of a unified synthesis model that can reliably generalize across multiple sites and translation tasks. Comprehensive experiments on multi-site datasets clearly demonstrate the enhanced performance of pFLSynth against prior federated methods in multi-contrast MRI synthesis.
SPJul 18, 2023
DreaMR: Diffusion-driven Counterfactual Explanation for Functional MRIHasan Atakan Bedel, Tolga Çukur
Deep learning analyses have offered sensitivity leaps in detection of cognitive states from functional MRI (fMRI) measurements across the brain. Yet, as deep models perform hierarchical nonlinear transformations on their input, interpreting the association between brain responses and cognitive states is challenging. Among common explanation approaches for deep fMRI classifiers, attribution methods show poor specificity and perturbation methods show limited plausibility. While counterfactual generation promises to address these limitations, previous methods use variational or adversarial priors that yield suboptimal sample fidelity. Here, we introduce the first diffusion-driven counterfactual method, DreaMR, to enable fMRI interpretation with high specificity, plausibility and fidelity. DreaMR performs diffusion-based resampling of an input fMRI sample to alter the decision of a downstream classifier, and then computes the minimal difference between the original and counterfactual samples for explanation. Unlike conventional diffusion methods, DreaMR leverages a novel fractional multi-phase-distilled diffusion prior to improve sampling efficiency without compromising fidelity, and it employs a transformer architecture to account for long-range spatiotemporal context in fMRI scans. Comprehensive experiments on neuroimaging datasets demonstrate the superior specificity, fidelity and efficiency of DreaMR in sample generation over state-of-the-art counterfactual methods for fMRI interpretation.
IVJul 17, 2022
Unsupervised Medical Image Translation with Adversarial Diffusion ModelsMuzaffer Özbey, Onat Dalmaz, Salman UH Dar et al.
Imputation of missing images via source-to-target modality translation can improve diversity in medical imaging protocols. A pervasive approach for synthesizing target images involves one-shot mapping through generative adversarial networks (GAN). Yet, GAN models that implicitly characterize the image distribution can suffer from limited sample fidelity. Here, we propose a novel method based on adversarial diffusion modeling, SynDiff, for improved performance in medical image translation. To capture a direct correlate of the image distribution, SynDiff leverages a conditional diffusion process that progressively maps noise and source images onto the target image. For fast and accurate image sampling during inference, large diffusion steps are taken with adversarial projections in the reverse diffusion direction. To enable training on unpaired datasets, a cycle-consistent architecture is devised with coupled diffusive and non-diffusive modules that bilaterally translate between two modalities. Extensive assessments are reported on the utility of SynDiff against competing GAN and diffusion models in multi-contrast MRI and MRI-CT translation. Our demonstrations indicate that SynDiff offers quantitatively and qualitatively superior performance against competing baselines.
IVJul 12, 2022
Adaptive Diffusion Priors for Accelerated MRI ReconstructionAlper Güngör, Salman UH Dar, Şaban Öztürk et al.
Deep MRI reconstruction is commonly performed with conditional models that de-alias undersampled acquisitions to recover images consistent with fully-sampled data. Since conditional models are trained with knowledge of the imaging operator, they can show poor generalization across variable operators. Unconditional models instead learn generative image priors decoupled from the operator to improve reliability against domain shifts related to the imaging operator. Recent diffusion models are particularly promising given their high sample fidelity. Nevertheless, inference with a static image prior can perform suboptimally. Here we propose the first adaptive diffusion prior for MRI reconstruction, AdaDiff, to improve performance and reliability against domain shifts. AdaDiff leverages an efficient diffusion prior trained via adversarial mapping over large reverse diffusion steps. A two-phase reconstruction is executed following training: a rapid-diffusion phase that produces an initial reconstruction with the trained prior, and an adaptation phase that further refines the result by updating the prior to minimize data-consistency loss. Demonstrations on multi-contrast brain MRI clearly indicate that AdaDiff outperforms competing conditional and unconditional methods under domain shifts, and achieves superior or on par within-domain performance.
IVMar 18, 2023
FD-Net: An Unsupervised Deep Forward-Distortion Model for Susceptibility Artifact Correction in EPIAbdallah Zaid Alkilani, Tolga Çukur, Emine Ulku Saritas
Recent learning-based correction approaches in EPI estimate a displacement field, unwarp the reversed-PE image pair with the estimated field, and average the unwarped pair to yield a corrected image. Unsupervised learning in these unwarping-based methods is commonly attained via a similarity constraint between the unwarped images in reversed-PE directions, neglecting consistency to the acquired EPI images. This work introduces an unsupervised deep-learning method for fast and effective correction of susceptibility artifacts in reversed phase-encode (PE) image pairs acquired with EPI. FD-Net predicts both the susceptibility-induced displacement field and the underlying anatomically-correct image. Unlike previous methods, FD-Net enforces the forward-distortions of the correct image in both PE directions to be consistent with the acquired reversed-PE image pair. FD-Net further leverages a multiresolution architecture to maintain high local and global performance. FD-Net performs competitively with a gold-standard reference method (TOPUP) in image quality, while enabling a leap in computational efficiency. Furthermore, FD-Net outperforms recent unwarping-based methods for unsupervised correction in terms of both image and field quality. The unsupervised FD-Net method introduces a deep forward-distortion approach to enable fast, high-fidelity correction of susceptibility artifacts in EPI by maintaining consistency to measured data. Therefore, it holds great promise for improving the anatomical accuracy of EPI imaging.
CLSep 1, 2022
Unsupervised Simplification of Legal TextsMert Cemri, Tolga Çukur, Aykut Koç
The processing of legal texts has been developing as an emerging field in natural language processing (NLP). Legal texts contain unique jargon and complex linguistic attributes in vocabulary, semantics, syntax, and morphology. Therefore, the development of text simplification (TS) methods specific to the legal domain is of paramount importance for facilitating comprehension of legal text by ordinary people and providing inputs to high-level models for mainstream legal NLP applications. While a recent study proposed a rule-based TS method for legal text, learning-based TS in the legal domain has not been considered previously. Here we introduce an unsupervised simplification method for legal texts (USLT). USLT performs domain-specific TS by replacing complex words and splitting long sentences. To this end, USLT detects complex words in a sentence, generates candidates via a masked-transformer model, and selects a candidate for substitution based on a rank score. Afterward, USLT recursively decomposes long sentences into a hierarchy of shorter core and context sentences while preserving semantic meaning. We demonstrate that USLT outperforms state-of-the-art domain-general TS methods in text simplicity while keeping the semantics intact.
IVOct 9, 2023
HydraViT: Adaptive Multi-Branch Transformer for Multi-Label Disease Classification from Chest X-ray ImagesŞaban Öztürk, M. Yiğit Turalı, Tolga Çukur
Chest X-ray is an essential diagnostic tool in the identification of chest diseases given its high sensitivity to pathological abnormalities in the lungs. However, image-driven diagnosis is still challenging due to heterogeneity in size and location of pathology, as well as visual similarities and co-occurrence of separate pathology. Since disease-related regions often occupy a relatively small portion of diagnostic images, classification models based on traditional convolutional neural networks (CNNs) are adversely affected given their locality bias. While CNNs were previously augmented with attention maps or spatial masks to guide focus on potentially critical regions, learning localization guidance under heterogeneity in the spatial distribution of pathology is challenging. To improve multi-label classification performance, here we propose a novel method, HydraViT, that synergistically combines a transformer backbone with a multi-branch output module with learned weighting. The transformer backbone enhances sensitivity to long-range context in X-ray images, while using the self-attention mechanism to adaptively focus on task-critical regions. The multi-branch output module dedicates an independent branch to each disease label to attain robust learning across separate disease classes, along with an aggregated branch across labels to maintain sensitivity to co-occurrence relationships among pathology. Experiments demonstrate that, on average, HydraViT outperforms competing attention-guided methods by 1.2%, region-guided methods by 1.4%, and semantic-guided methods by 1.0% in multi-label classification performance.
IVSep 19, 2024
DenoMamba: A fused state-space model for low-dose CT denoisingŞaban Öztürk, Oğuz Can Duran, Tolga Çukur
Low-dose computed tomography (LDCT) lower potential risks linked to radiation exposure while relying on advanced denoising algorithms to maintain diagnostic quality in reconstructed images. The reigning paradigm in LDCT denoising is based on neural network models that learn data-driven image priors to separate noise evoked by dose reduction from underlying tissue signals. Naturally, the fidelity of these priors depend on the model's ability to capture the broad range of contextual features evident in CT images. Earlier convolutional neural networks (CNN) are highly adept at efficiently capturing short-range spatial context, but their limited receptive fields reduce sensitivity to interactions over longer distances. Although transformers based on self-attention mechanisms have recently been posed to increase sensitivity to long-range context, they can suffer from suboptimal performance and efficiency due to elevated model complexity, particularly for high-resolution CT images. For high-quality restoration of LDCT images, here we introduce DenoMamba, a novel denoising method based on state-space modeling (SSM), that efficiently captures short- and long-range context in medical images. Following an hourglass architecture with encoder-decoder stages, DenoMamba employs a spatial SSM module to encode spatial context and a novel channel SSM module equipped with a secondary gated convolution network to encode latent features of channel context at each stage. Feature maps from the two modules are then consolidated with low-level input features via a convolution fusion module (CFM). Comprehensive experiments on LDCT datasets with 25\% and 10\% dose reduction demonstrate that DenoMamba outperforms state-of-the-art denoisers with average improvements of 1.4dB PSNR, 1.1% SSIM, and 1.6% RMSE in recovered image quality.
IVSep 20, 2023
CalibFPA: A Focal Plane Array Imaging System based on Online Deep-Learning CalibrationAlper Güngör, M. Umut Bahceci, Yasin Ergen et al.
Compressive focal plane arrays (FPA) enable cost-effective high-resolution (HR) imaging by acquisition of several multiplexed measurements on a low-resolution (LR) sensor. Multiplexed encoding of the visual scene is typically performed via electronically controllable spatial light modulators (SLM). An HR image is then reconstructed from the encoded measurements by solving an inverse problem that involves the forward model of the imaging system. To capture system non-idealities such as optical aberrations, a mainstream approach is to conduct an offline calibration scan to measure the system response for a point source at each spatial location on the imaging grid. However, it is challenging to run calibration scans when using structured SLMs as they cannot encode individual grid locations. In this study, we propose a novel compressive FPA system based on online deep-learning calibration of multiplexed LR measurements (CalibFPA). We introduce a piezo-stage that locomotes a pre-printed fixed coded aperture. A deep neural network is then leveraged to correct for the influences of system non-idealities in multiplexed measurements without the need for offline calibration scans. Finally, a deep plug-and-play algorithm is used to reconstruct images from corrected measurements. On simulated and experimental datasets, we demonstrate that CalibFPA outperforms state-of-the-art compressive FPA methods. We also report analyses to validate the design elements in CalibFPA and assess computational complexity.
CVMay 4, 2025Code
Efficient Noise Calculation in Deep Learning-based MRI ReconstructionsOnat Dalmaz, Arjun D. Desai, Reinhard Heckel et al.
Accelerated MRI reconstruction involves solving an ill-posed inverse problem where noise in acquired data propagates to the reconstructed images. Noise analyses are central to MRI reconstruction for providing an explicit measure of solution fidelity and for guiding the design and deployment of novel reconstruction methods. However, deep learning (DL)-based reconstruction methods have often overlooked noise propagation due to inherent analytical and computational challenges, despite its critical importance. This work proposes a theoretically grounded, memory-efficient technique to calculate voxel-wise variance for quantifying uncertainty due to acquisition noise in accelerated MRI reconstructions. Our approach approximates noise covariance using the DL network's Jacobian, which is intractable to calculate. To circumvent this, we derive an unbiased estimator for the diagonal of this covariance matrix (voxel-wise variance) and introduce a Jacobian sketching technique to efficiently implement it. We evaluate our method on knee and brain MRI datasets for both data- and physics-driven networks trained in supervised and unsupervised manners. Compared to empirical references obtained via Monte Carlo simulations, our technique achieves near-equivalent performance while reducing computational and memory demands by an order of magnitude or more. Furthermore, our method is robust across varying input noise levels, acceleration factors, and diverse undersampling schemes, highlighting its broad applicability. Our work reintroduces accurate and efficient noise analysis as a central tenet of reconstruction algorithms, holding promise to reshape how we evaluate and deploy DL-based MRI. Our code will be made publicly available upon acceptance.
IVMay 22, 2024
I2I-Mamba: Multi-modal medical image synthesis via selective state space modelingOmer F. Atli, Bilal Kabas, Fuat Arslan et al.
Multi-modal medical image synthesis involves nonlinear transformation of tissue signals between source and target modalities, where tissues exhibit contextual interactions across diverse spatial distances. As such, the utility of a network architecture in synthesis depends on its ability to express the broad set of contextual features in medical images. Convolutional neural networks (CNNs) offer high local precision at the expense of poor sensitivity to long-range context. While transformers promise to alleviate this issue, they suffer from an unfavorable trade-off between sensitivity to long- versus short-range context due to the intrinsic complexity of attention filters. To effectively capture contextual features while avoiding the complexitydriven trade-offs, here we introduce a novel multi-modal synthesis method, I2I-Mamba, based on the state space modeling (SSM) framework. Focusing on high-level representations across a hybrid residual architecture, I2I-Mamba leverages novel dual-domain Mamba (ddMamba) blocks for complementary contextual modeling in image and Fourier domains, while maintaining spatial precision with convolutional layers. Diverting from conventional raster-scan trajectories, ddMamba leverages novel SSM operators based on a spiral-scan trajectory to learn context with enhanced angular isotropy and radial coverage, and a channel-mixing layer to aggregate context across the channel dimension. Comprehensive demonstrations on multi-contrast MRI and MRI-CT protocols indicate that I2I-Mamba outperforms state-of-the-art CNNs, transformers and SSMs.
IVMay 10, 2024
Self-Consistent Recursive Diffusion Bridge for Medical Image TranslationFuat Arslan, Bilal Kabas, Onat Dalmaz et al.
Denoising diffusion models (DDM) have gained recent traction in medical image translation given improved training stability over adversarial models. DDMs learn a multi-step denoising transformation to progressively map random Gaussian-noise images onto target-modality images, while receiving stationary guidance from source-modality images. As this denoising transformation diverges significantly from the task-relevant source-to-target transformation, DDMs can suffer from weak source-modality guidance. Here, we propose a novel self-consistent recursive diffusion bridge (SelfRDB) for improved performance in medical image translation. Unlike DDMs, SelfRDB employs a novel forward process with start- and end-points defined based on target and source images, respectively. Intermediate image samples across the process are expressed via a normal distribution with mean taken as a convex combination of start-end points, and variance from additive noise. Unlike regular diffusion bridges that prescribe zero variance at start-end points and high variance at mid-point of the process, we propose a novel noise scheduling with monotonically increasing variance towards the end-point in order to boost generalization performance and facilitate information transfer between the two modalities. To further enhance sampling accuracy in each reverse step, we propose a novel sampling procedure where the network recursively generates a transient-estimate of the target image until convergence onto a self-consistent solution. Comprehensive analyses in multi-contrast MRI and MRI-CT translation indicate that SelfRDB offers superior performance against competing methods.
IVDec 12, 2024
Physics-Driven Autoregressive State Space Models for Medical Image ReconstructionBilal Kabas, Fuat Arslan, Valiyeh A. Nezhad et al.
Medical image reconstruction from undersampled acquisitions is an ill-posed inverse problem requiring accurate recovery of anatomical structures from incomplete measurements. Physics-driven (PD) network models have gained prominence for this task by integrating data-consistency mechanisms with learned priors, enabling improved performance over purely data-driven approaches. However, reconstruction quality still hinges on the network's ability to disentangle artifacts from true anatomical signals-both of which exhibit complex, multi-scale contextual structure. Convolutional neural networks (CNNs) capture local correlations but often struggle with non-local dependencies. While transformers aim to alleviate this limitation, practical implementations involve design compromises to reduce computational cost by balancing local and non-local sensitivity, occasionally resulting in performance comparable to CNNs. To address these challenges, we propose MambaRoll, a novel physics-driven autoregressive state space model (SSM) for high-fidelity and efficient image reconstruction. MambaRoll employs an unrolled architecture where each cascade autoregressively predicts finer-scale feature maps conditioned on coarser-scale representations, enabling consistent multi-scale context propagation. Each stage is built on a hierarchy of scale-specific PD-SSM modules that capture spatial dependencies while enforcing data consistency through residual correction. To further improve scale-aware learning, we introduce a Deep Multi-Scale Decoding (DMSD) loss, which provides supervision at intermediate spatial scales in alignment with the autoregressive design. Demonstrations on accelerated MRI and sparse-view CT reconstructions show that MambaRoll consistently outperforms state-of-the-art CNN-, transformer-, and SSM-based methods.
IVFeb 6, 2025
Generative Autoregressive Transformers for Model-Agnostic Federated MRI ReconstructionValiyeh A. Nezhad, Gokberk Elmas, Bilal Kabas et al.
While learning-based models hold great promise for MRI reconstruction, single-site models trained on limited local datasets often show poor generalization. This has motivated collaborative training across institutions via federated learning (FL)-a privacy-preserving framework that aggregates model updates instead of sharing raw data. Conventional FL requires architectural homogeneity, restricting sites from using models tailored to their resources or needs. To address this limitation, we propose FedGAT, a model-agnostic FL technique that first collaboratively trains a global generative prior for MR images, adapted from a natural image foundation model composed of a variational autoencoder (VAE) and a transformer that generates images via spatial-scale autoregression. We fine-tune the transformer module after injecting it with a lightweight site-specific prompting mechanism, keeping the VAE frozen, to efficiently adapt the model to multi-site MRI data. In a second tier, each site independently trains its preferred reconstruction model by augmenting local data with synthetic MRI data from other sites, generated by site-prompting the tuned prior. This decentralized augmentation improves generalization while preserving privacy. Experiments on multi-institutional datasets show that FedGAT outperforms state-of-the-art FL baselines in both within- and cross-site reconstruction performance under model-heterogeneous settings.
CVApr 22, 2025
Meta-Entity Driven Triplet Mining for Aligning Medical Vision-Language ModelsSaban Ozturk, Melih B. Yilmaz, Muti Kara et al.
Diagnostic imaging relies on interpreting both images and radiology reports, but the growing data volumes place significant pressure on medical experts, yielding increased errors and workflow backlogs. Medical vision-language models (med-VLMs) have emerged as a powerful framework to efficiently process multimodal imaging data, particularly in chest X-ray (CXR) evaluations, albeit their performance hinges on how well image and text representations are aligned. Existing alignment methods, predominantly based on contrastive learning, prioritize separation between disease classes over segregation of fine-grained pathology attributes like location, size or severity, leading to suboptimal representations. Here, we propose MedTrim (Meta-entity-driven Triplet mining), a novel method that enhances image-text alignment through multimodal triplet learning synergistically guided by disease class as well as adjectival and directional pathology descriptors. Unlike common alignment methods that separate broad disease classes, MedTrim leverages structured meta-entity information to preserve subtle but clinically significant intra-class variations. For this purpose, we first introduce an ontology-based entity recognition module that extracts pathology-specific meta-entities from CXR reports, as annotations on pathology attributes are rare in public datasets. For refined sample selection in triplet mining, we then introduce a novel score function that captures an aggregate measure of inter-sample similarity based on disease classes and adjectival/directional descriptors. Lastly, we introduce a multimodal triplet alignment objective for explicit within- and cross-modal alignment between samples sharing detailed pathology characteristics. Our demonstrations indicate that MedTrim improves performance in downstream retrieval and classification tasks compared to state-of-the-art alignment methods.
CVJun 14, 2024
Bayesian Conditioned Diffusion Models for Inverse ProblemsAlper Güngör, Bahri Batuhan Bilecen, Tolga Çukur
Diffusion models have recently been shown to excel in many image reconstruction tasks that involve inverse problems based on a forward measurement operator. A common framework uses task-agnostic unconditional models that are later post-conditioned for reconstruction, an approach that typically suffers from suboptimal task performance. While task-specific conditional models have also been proposed, current methods heuristically inject measured data as a naive input channel that elicits sampling inaccuracies. Here, we address the optimal conditioning of diffusion models for solving challenging inverse problems that arise during image reconstruction. Specifically, we propose a novel Bayesian conditioning technique for diffusion models, BCDM, based on score-functions associated with the conditional distribution of desired images given measured data. We rigorously derive the theory to express and train the conditional score-function. Finally, we show state-of-the-art performance in image dealiasing, deblurring, super-resolution, and inpainting with the proposed technique.
IVFeb 8, 2022
Federated Learning of Generative Image Priors for MRI ReconstructionGokberk Elmas, Salman UH Dar, Yilmaz Korkmaz et al.
Multi-institutional efforts can facilitate training of deep MRI reconstruction models, albeit privacy risks arise during cross-site sharing of imaging data. Federated learning (FL) has recently been introduced to address privacy concerns by enabling distributed training without transfer of imaging data. Existing FL methods for MRI reconstruction employ conditional models to map from undersampled to fully-sampled acquisitions via explicit knowledge of the imaging operator. Since conditional models generalize poorly across different acceleration rates or sampling densities, imaging operators must be fixed between training and testing, and they are typically matched across sites. To improve generalization and flexibility in multi-institutional collaborations, here we introduce a novel method for MRI reconstruction based on Federated learning of Generative IMage Priors (FedGIMP). FedGIMP leverages a two-stage approach: cross-site learning of a generative MRI prior, and subject-specific injection of the imaging operator. The global MRI prior is learned via an unconditional adversarial model that synthesizes high-quality MR images based on latent variables. Specificity in the prior is preserved via a mapper subnetwork that produces site-specific latents. During inference, the prior is combined with subject-specific imaging operators to enable reconstruction, and further adapted to individual test samples by minimizing data-consistency loss. Comprehensive experiments on multi-institutional datasets clearly demonstrate enhanced generalization performance of FedGIMP against site-specific and federated methods based on conditional models, as well as traditional reconstruction methods.
IVJun 30, 2021
ResViT: Residual vision transformers for multi-modal medical image synthesisOnat Dalmaz, Mahmut Yurt, Tolga Çukur
Generative adversarial models with convolutional neural network (CNN) backbones have recently been established as state-of-the-art in numerous medical image synthesis tasks. However, CNNs are designed to perform local processing with compact filters, and this inductive bias compromises learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that leverages the contextual sensitivity of vision transformers along with the precision of convolution operators and realism of adversarial learning.} ResViT's generator employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine residual convolutional and transformer modules. Residual connections in ART blocks promote diversity in captured representations, while a channel compression module distills task-relevant information. A weight sharing strategy is introduced among ART blocks to mitigate computational burden. A unified implementation is introduced to avoid the need to rebuild separate synthesis models for varying source-target modality configurations. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI, and CT images from MRI. Our results indicate superiority of ResViT against competing CNN- and transformer-based methods in terms of qualitative observations and quantitative metrics.
IVMay 15, 2021
Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial TransformersYilmaz Korkmaz, Salman UH Dar, Mahmut Yurt et al.
Supervised reconstruction models are characteristically trained on matched pairs of undersampled and fully-sampled data to capture an MRI prior, along with supervision regarding the imaging operator to enforce data consistency. To reduce supervision requirements, the recent deep image prior framework instead conjoins untrained MRI priors with the imaging operator during inference. Yet, canonical convolutional architectures are suboptimal in capturing long-range relationships, and priors based on randomly initialized networks may yield suboptimal performance. To address these limitations, here we introduce a novel unsupervised MRI reconstruction method based on zero-Shot Learned Adversarial TransformERs (SLATER). SLATER embodies a deep adversarial network with cross-attention transformers to map noise and latent variables onto coil-combined MR images. During pre-training, this unconditional network learns a high-quality MRI prior in an unsupervised generative modeling task. During inference, a zero-shot reconstruction is then performed by incorporating the imaging operator and optimizing the prior to maximize consistency to undersampled data. Comprehensive experiments on brain MRI datasets clearly demonstrate the superior performance of SLATER against state-of-the-art unsupervised methods.
CVMar 13, 2021
A Few-Shot Learning Approach for Accelerated MRI via Fusion of Data-Driven and Subject-Driven PriorsSalman Ul Hassan Dar, Mahmut Yurt, Tolga Çukur
Deep neural networks (DNNs) have recently found emerging use in accelerated MRI reconstruction. DNNs typically learn data-driven priors from large datasets constituting pairs of undersampled and fully-sampled acquisitions. Acquiring such large datasets, however, might be impractical. To mitigate this limitation, we propose a few-shot learning approach for accelerated MRI that merges subject-driven priors obtained via physical signal models with data-driven priors obtained from a few training samples. Demonstrations on brain MR images from the NYU fastMRI dataset indicate that the proposed approach requires just a few samples to outperform traditional parallel imaging and DNN algorithms.
IVDec 18, 2020
Three Dimensional MR Image Synthesis with Progressive Generative Adversarial NetworksMuzaffer Özbey, Mahmut Yurt, Salman Ul Hassan Dar et al.
Mainstream deep models for three-dimensional MRI synthesis are either cross-sectional or volumetric depending on the input. Cross-sectional models can decrease the model complexity, but they may lead to discontinuity artifacts. On the other hand, volumetric models can alleviate the discontinuity artifacts, but they might suffer from loss of spatial resolution due to increased model complexity coupled with scarce training data. To mitigate the limitations of both approaches, we propose a novel model that progressively recovers the target volume via simpler synthesis tasks across individual orientations.
IVNov 29, 2020
Semi-Supervised Learning of Mutually Accelerated MRI Synthesis without Fully-Sampled Ground TruthsMahmut Yurt, Salman Ul Hassan Dar, Muzaffer Özbey et al.
Learning-based synthetic multi-contrast MRI commonly involves deep models trained using high-quality images of source and target contrasts, regardless of whether source and target domain samples are paired or unpaired. This results in undesirable reliance on fully-sampled acquisitions of all MRI contrasts, which might prove impractical due to limitations on scan costs and time. Here, we propose a novel semi-supervised deep generative model that instead learns to recover high-quality target images directly from accelerated acquisitions of source and target contrasts. To achieve this, the proposed model introduces novel multi-coil tensor losses in image, k-space and adversarial domains. These selective losses are based only on acquired k-space samples, and randomized sampling masks are used across subjects to capture relationships among acquired and non-acquired k-space regions. Comprehensive experiments on multi-contrast neuroimaging datasets demonstrate that our semi-supervised approach yields equivalent performance to gold-standard fully-supervised models, while outperforming a cascaded approach that learns to synthesize based on reconstructions of undersampled data. Therefore, the proposed approach holds great promise to improve the feasibility and utility of accelerated MRI acquisitions mutually undersampled across both contrast sets and k-space.
CVNov 27, 2020
Progressively Volumetrized Deep Generative Models for Data-Efficient Contextual Learning of MR Image RecoveryMahmut Yurt, Muzaffer Özbey, Salman Ul Hassan Dar et al.
Magnetic resonance imaging (MRI) offers the flexibility to image a given anatomic volume under a multitude of tissue contrasts. Yet, scan time considerations put stringent limits on the quality and diversity of MRI data. The gold-standard approach to alleviate this limitation is to recover high-quality images from data undersampled across various dimensions, most commonly the Fourier domain or contrast sets. A primary distinction among recovery methods is whether the anatomy is processed per volume or per cross-section. Volumetric models offer enhanced capture of global contextual information, but they can suffer from suboptimal learning due to elevated model complexity. Cross-sectional models with lower complexity offer improved learning behavior, yet they ignore contextual information across the longitudinal dimension of the volume. Here, we introduce a novel progressive volumetrization strategy for generative models (ProvoGAN) that serially decomposes complex volumetric image recovery tasks into successive cross-sectional mappings task-optimally ordered across individual rectilinear dimensions. ProvoGAN effectively captures global context and recovers fine-structural details across all dimensions, while maintaining low model complexity and improved learning behaviour. Comprehensive demonstrations on mainstream MRI reconstruction and synthesis tasks show that ProvoGAN yields superior performance to state-of-the-art volumetric and cross-sectional models.
IVSep 25, 2019
mustGAN: Multi-Stream Generative Adversarial Networks for MR Image SynthesisMahmut Yurt, Salman Ul Hassan Dar, Aykut Erdem et al.
Multi-contrast MRI protocols increase the level of morphological information available for diagnosis. Yet, the number and quality of contrasts is limited in practice by various factors including scan time and patient motion. Synthesis of missing or corrupted contrasts can alleviate this limitation to improve clinical utility. Common approaches for multi-contrast MRI involve either one-to-one and many-to-one synthesis methods. One-to-one methods take as input a single source contrast, and they learn a latent representation sensitive to unique features of the source. Meanwhile, many-to-one methods receive multiple distinct sources, and they learn a shared latent representation more sensitive to common features across sources. For enhanced image synthesis, here we propose a multi-stream approach that aggregates information across multiple source images via a mixture of multiple one-to-one streams and a joint many-to-one stream. The shared feature maps generated in the many-to-one stream and the complementary feature maps generated in the one-to-one streams are combined with a fusion block. The location of the fusion block is adaptively modified to maximize task-specific performance. Qualitative and quantitative assessments on T1-, T2-, PD-weighted and FLAIR images clearly demonstrate the superior performance of the proposed method compared to previous state-of-the-art one-to-one and many-to-one methods.
IVFeb 1, 2019
Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRIThomas Sanchez, Baran Gözcü, Ruud B. van Heeswijk et al.
Compressed sensing applied to magnetic resonance imaging (MRI) allows to reduce the scanning time by enabling images to be reconstructed from highly undersampled data. In this paper, we tackle the problem of designing a sampling mask for an arbitrary reconstruction method and a limited acquisition budget. Namely, we look for an optimal probability distribution from which a mask with a fixed cardinality is drawn. We demonstrate that this problem admits a compactly supported solution, which leads to a deterministic optimal sampling mask. We then propose a stochastic greedy algorithm that (i) provides an approximate solution to this problem, and (ii) resolves the scaling issues of [1,2]. We validate its performance on in vivo dynamic MRI with retrospective undersampling, showing that our method preserves the performance of [1,2] while reducing the computational burden by a factor close to 200.
CVMay 27, 2018
Synergistic Reconstruction and Synthesis via Generative Adversarial Networks for Accelerated Multi-Contrast MRISalman Ul Hassan Dar, Mahmut Yurt, Mohammad Shahdloo et al.
Multi-contrast MRI acquisitions of an anatomy enrich the magnitude of information available for diagnosis. Yet, excessive scan times associated with additional contrasts may be a limiting factor. Two mainstream approaches for enhanced scan efficiency are reconstruction of undersampled acquisitions and synthesis of missing acquisitions. In reconstruction, performance decreases towards higher acceleration factors with diminished sampling density particularly at high-spatial-frequencies. In synthesis, the absence of data samples from the target contrast can lead to artefactual sensitivity or insensitivity to image features. Here we propose a new approach for synergistic reconstruction-synthesis of multi-contrast MRI based on conditional generative adversarial networks. The proposed method preserves high-frequency details of the target contrast by relying on the shared high-frequency information available from the source contrast, and prevents feature leakage or loss by relying on the undersampled acquisitions of the target contrast. Demonstrations on brain MRI datasets from healthy subjects and patients indicate the superior performance of the proposed method compared to previous state-of-the-art. The proposed method can help improve the quality and scan efficiency of multi-contrast MRI exams.
IVMay 3, 2018
Learning-Based Compressive MRIBaran Gözcü, Rabeeh Karimi Mahabadi, Yen-Huan Li et al.
In the area of magnetic resonance imaging (MRI), an extensive range of non-linear reconstruction algorithms have been proposed that can be used with general Fourier subsampling patterns. However, the design of these subsampling patterns has typically been considered in isolation from the reconstruction rule and the anatomy under consideration. In this paper, we propose a learning-based framework for optimizing MRI subsampling patterns for a specific reconstruction rule and anatomy, considering both the noiseless and noisy settings. Our learning algorithm has access to a representative set of training signals, and searches for a sampling pattern that performs well on average for the signals in this set. We present a novel parameter-free greedy mask selection method, and show it to be effective for a variety of reconstruction rules and performance metrics. Moreover we also support our numerical findings by providing a rigorous justification of our framework via statistical learning theory.
CVFeb 5, 2018
Image Synthesis in Multi-Contrast MRI with Conditional Generative Adversarial NetworksSalman Ul Hassan Dar, Mahmut Yurt, Levent Karacan et al.
Acquiring images of the same anatomy with multiple different contrasts increases the diversity of diagnostic information available in an MR exam. Yet, scan time limitations may prohibit acquisition of certain contrasts, and images for some contrast may be corrupted by noise and artifacts. In such cases, the ability to synthesize unacquired or corrupted contrasts from remaining contrasts can improve diagnostic utility. For multi-contrast synthesis, current methods learn a nonlinear intensity transformation between the source and target images, either via nonlinear regression or deterministic neural networks. These methods can in turn suffer from loss of high-spatial-frequency information in synthesized images. Here we propose a new approach for multi-contrast MRI synthesis based on conditional generative adversarial networks. The proposed approach preserves high-frequency details via an adversarial loss; and it offers enhanced synthesis performance via a pixel-wise loss for registered multi-contrast images and a cycle-consistency loss for unregistered images. Information from neighboring cross-sections are utilized to further improved synthesis quality. Demonstrations on T1- and T2-weighted images from healthy subjects and patients clearly indicate the superior performance of the proposed approach compared to previous state-of-the-art methods. Our synthesis approach can help improve quality and versatility of multi-contrast MRI exams without the need for prolonged examinations.
CVOct 7, 2017
A Transfer-Learning Approach for Accelerated MRI using Deep Neural NetworksSalman Ul Hassan Dar, Muzaffer Özbey, Ahmet Burak Çatlı et al.
Purpose: Neural networks have received recent interest for reconstruction of undersampled MR acquisitions. Ideally network performance should be optimized by drawing the training and testing data from the same domain. In practice, however, large datasets comprising hundreds of subjects scanned under a common protocol are rare. The goal of this study is to introduce a transfer-learning approach to address the problem of data scarcity in training deep networks for accelerated MRI. Methods: Neural networks were trained on thousands of samples from public datasets of either natural images or brain MR images. The networks were then fine-tuned using only few tens of brain MR images in a distinct testing domain. Domain-transferred networks were compared to networks trained directly in the testing domain. Network performance was evaluated for varying acceleration factors (2-10), number of training samples (0.5-4k) and number of fine-tuning samples (0-100). Results: The proposed approach achieves successful domain transfer between MR images acquired with different contrasts (T1- and T2-weighted images), and between natural and MR images (ImageNet and T1- or T2-weighted images). Networks obtained via transfer-learning using only tens of images in the testing domain achieve nearly identical performance to networks trained directly in the testing domain using thousands of images. Conclusion: The proposed approach might facilitate the use of neural networks for MRI reconstruction without the need for collection of extensive imaging datasets.