CV · Apr 29, 2023 · Code
LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral Image Generation with Variance Regularization
Emmanuel Martinez, Roman Jacome, Alejandra Hernandez-Rojas et al.
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks. However, their performance is constrained because available datasets are limited by expensive, time-consuming acquisition. Data augmentation (DA) techniques are usually employed to mitigate the lack of data. Surpassing classical augmentation methods, such as geometric transformations, GANs enable diverse augmentation by learning and sampling from the data distribution. Nevertheless, GAN-based SI generation is challenging because the high dimensionality of this kind of data hinders the convergence of GAN training, yielding suboptimal generation. To surmount this limitation, we propose low-dimensional GAN (LD-GAN), where we train the GAN on a low-dimensional representation of the dataset given by the latent space of a pretrained autoencoder network. Thus, we generate new low-dimensional samples, which are then mapped to the SI dimension with the pretrained decoder network. In addition, we propose a statistical regularization that controls the variance of the low-dimensional representation during autoencoder training and achieves high diversity in the samples generated with the GAN. We validate LD-GAN as a data augmentation strategy for compressive spectral imaging, SI super-resolution, and RGB-to-spectral tasks, with improvements ranging from 0.5 to 1 dB across the three tasks. We compare against training without data augmentation, traditional DA, and the same GAN adjusted and trained to generate the full-sized SIs. The code for this paper can be found at https://github.com/romanjacome99/LD_GAN.git
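The abstract does not give the form of the variance regularization, so the following is only a minimal sketch of one plausible penalty: it measures how far the per-dimension variance of a batch of latent codes is from a target value. The function name `variance_penalty`, the squared-error form, and the `target_var` parameter are assumptions for illustration, not the authors' definition.

```python
import numpy as np

def variance_penalty(latents, target_var=1.0):
    """Hypothetical regularizer: mean squared gap between the variance of
    each latent dimension and a target variance (assumed form)."""
    # latents has shape (num_samples, latent_dim); variance is taken per dimension.
    per_dim_var = latents.var(axis=0)
    return float(np.mean((per_dim_var - target_var) ** 2))

rng = np.random.default_rng(0)
z_spread = rng.normal(0.0, 1.0, size=(4096, 8))     # latent codes with variance near 1
z_collapsed = rng.normal(0.0, 0.1, size=(4096, 8))  # latent codes with variance near 0.01

# The collapsed codes receive the larger penalty under this assumed form.
print(variance_penalty(z_spread), variance_penalty(z_collapsed))
```

In a training loop, a term like this would be added to the autoencoder reconstruction loss with a weighting factor; the weighting and the target are design choices the abstract does not specify.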
IV · May 24, 2022
D$^\text{2}$UF: Deep Coded Aperture Design and Unrolling Algorithm for Compressive Spectral Image Fusion
Roman Jacome, Jorge Bacca, Henry Arguello
Compressive spectral imaging (CSI) has attracted significant attention since it employs synthetic apertures to codify spatial and spectral information, sensing only 2D projections of the 3D spectral image. However, these optical architectures suffer from a trade-off between the spatial and spectral resolution of the reconstructed image due to technology limitations. To overcome this issue, compressive spectral image fusion (CSIF) employs the projected measurements of two CSI architectures with different resolutions to estimate a high-spatial, high-spectral resolution image. This work presents the fusion of the compressive measurements of a low-spatial, high-spectral resolution coded aperture snapshot spectral imager (CASSI) architecture and a high-spatial, low-spectral resolution multispectral color filter array (MCFA) system. Unlike previous CSIF works, this paper proposes joint optimization of the sensing architectures and a reconstruction network in an end-to-end (E2E) manner. The trainable optical parameters are the coded aperture (CA) in the CASSI and the colored coded aperture in the MCFA system; a sigmoid activation function and a regularization function encourage binary values in the trainable variables so that the designs can be implemented. Additionally, an unrolling-based network inspired by the alternating direction method of multipliers (ADMM) optimization is formulated to address the reconstruction step and the acquisition systems design jointly. Finally, a spatial-spectral inspired loss function is employed at the end of each unrolling layer to improve the convergence of the unrolling network. The proposed method outperforms previous CSIF methods, and experimental results validate the method with real measurements.
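The sigmoid-plus-regularizer idea for pushing trainable coded-aperture entries toward binary values can be sketched as follows. The abstract does not state which regularizer is used; the form $x^2(1-x)^2$ below is a common choice that vanishes only at 0 and 1, and is an assumption here, as are the function names.

```python
import numpy as np

def sigmoid(weights):
    """Map unconstrained trainable weights to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-weights))

def binary_regularizer(aperture):
    """Assumed regularizer: mean of x^2 (1 - x)^2, which is zero only
    when every entry is exactly 0 or 1."""
    return float(np.mean((aperture ** 2) * ((1.0 - aperture) ** 2)))

# Weights near zero give a gray aperture (entries near 0.5), which is penalized.
gray = sigmoid(np.zeros((4, 4)))
# Large-magnitude weights give a nearly binary aperture, which is barely penalized.
near_binary = sigmoid(np.array([[-20.0, 20.0], [20.0, -20.0]]))

print(binary_regularizer(gray), binary_regularizer(near_binary))
```

During E2E training, a term of this kind would be added to the reconstruction loss with a weight that trades reconstruction quality against how close the learned aperture is to a binary pattern.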
IV · May 14
DIPA: Distilled Preconditioned Algorithms for Solving Imaging Inverse Problems
Romario Gualdrón-Hurtado, Roman Jacome, Leon Suarez et al.
Solving imaging inverse problems has usually been addressed by designing proper prior models of the underlying signal. However, minimizing the data fidelity term poses significant challenges due to the ill-conditioned sensing matrix caused by physical constraints in the acquisition system. Thus, preconditioning techniques have been adopted in classical optimization theory to address ill-conditioned data-fidelity minimization by transforming the algorithm gradient step to achieve faster convergence and better numerical stability. We extend the preconditioning concept beyond convergence acceleration and use it to improve reconstruction quality. We introduce DIPA: Distilled Preconditioned Algorithms, where a preconditioning operator (PO) is optimized using teacher-guided distillation criteria. Unlike standard model-compression knowledge distillation (KD), the teacher and student differ by the sensing operators available during reconstruction: the teacher uses a simulated, better-conditioned, and more informative sensing matrix, whereas the student uses the physically feasible sensing matrix. We design different distillation loss functions to transfer different properties of the teacher algorithm to the preconditioned student. The PO can be linear (L-DIPA), allowing interpretability, or non-linear (N-DIPA), parametrized by a neural network, offering better scalability. We validate the proposed PO design across several imaging modalities, including magnetic resonance imaging, compressed sensing, and super-resolution imaging.
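To illustrate the classical role of preconditioning mentioned above, the sketch below runs gradient descent on a least-squares data-fidelity term with and without a preconditioner applied to the gradient step. The preconditioner used here is a standard regularized inverse of $A^\top A$, not the distilled operator proposed in DIPA; the matrix, step size, and function name are illustrative assumptions.

```python
import numpy as np

def preconditioned_gd(A, y, P, step, iters):
    """Iterate x <- x - step * P @ A.T @ (A @ x - y), starting from zero."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)   # gradient of 0.5 * ||A x - y||^2
        x = x - step * (P @ grad)  # preconditioner reshapes the gradient step
    return x

# Ill-conditioned toy sensing matrix: A.T @ A has condition number 100.
A = np.diag([1.0, 0.1])
x_true = np.array([1.0, 1.0])
y = A @ x_true

# Plain gradient descent corresponds to P = I.
x_plain = preconditioned_gd(A, y, np.eye(2), step=1.0, iters=20)

# Classical preconditioner (assumed for this sketch): regularized inverse of A.T @ A.
P = np.linalg.inv(A.T @ A + 1e-6 * np.eye(2))
x_precond = preconditioned_gd(A, y, P, step=1.0, iters=20)

print(np.linalg.norm(x_plain - x_true), np.linalg.norm(x_precond - x_true))
```

After the same number of iterations, the plain iterate is still far from the solution along the weakly sensed direction, while the preconditioned iterate has essentially converged.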
CV · Feb 23
GSNR: Graph Smooth Null-Space Representation for Inverse Problems
Romario Gualdrón-Hurtado, Roman Jacome, Rafael S. Suarez et al.
Inverse problems in imaging are ill-posed, leading to infinitely many solutions consistent with the measurements due to the non-trivial null-space of the sensing matrix. Common image priors, such as sparsity, smoothness, or score functions, promote solutions on the general image manifold. However, as these priors do not constrain the null-space component, they can bias the reconstruction. Thus, we aim to incorporate meaningful null-space information in the reconstruction framework. Inspired by smooth image representation on graphs, we propose Graph-Smooth Null-Space Representation (GSNR), a mechanism that imposes structure only on the invisible component. In particular, given a graph Laplacian, we construct a null-restricted Laplacian that encodes similarity between neighboring pixels in the null-space signal, and we design a low-dimensional projection matrix from the $p$-smoothest spectral graph modes (lowest graph frequencies). This approach has strong theoretical and practical implications: i) improved convergence via a null-only graph regularizer, ii) better coverage (how much null-space variance is captured by the $p$ modes), and iii) high predictability (how well these modes can be inferred from the measurements). GSNR is incorporated into well-known inverse problem solvers, e.g., PnP, DIP, and diffusion solvers, in four scenarios: image deblurring, compressed sensing, demosaicing, and image super-resolution, providing consistent improvement of up to 4.3 dB over baseline formulations and up to 1 dB compared with end-to-end learned models in terms of PSNR.
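One way to picture a null-restricted Laplacian and its $p$ smoothest modes is sketched below on a 1D path graph. The construction (restrict the Laplacian to an orthonormal basis of the null-space, then keep the eigenvectors with the smallest eigenvalues) is an assumption made for illustration and may differ from the paper's exact definition; all function names are hypothetical.

```python
import numpy as np

def path_laplacian(n):
    """Graph Laplacian of a path graph connecting neighboring samples."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return np.diag(W.sum(axis=1)) - W

def null_space_basis(H, tol=1e-10):
    """Orthonormal basis of null(H) from the right singular vectors."""
    _, s, Vt = np.linalg.svd(H)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T  # shape (n, n - rank)

def smooth_null_modes(H, L, p):
    """Assumed construction: restrict L to null(H) and keep the p
    eigenvectors with the lowest graph frequencies."""
    N = null_space_basis(H)
    L_null = N.T @ L @ N                   # Laplacian seen only by null-space signals
    evals, evecs = np.linalg.eigh(L_null)  # eigenvalues in ascending order
    return N @ evecs[:, :p], evals[:p]

n, p = 8, 2
H = np.eye(n)[::2]  # toy sensing matrix: keeps every other sample
modes, freqs = smooth_null_modes(H, path_laplacian(n), p)
S = modes.T         # low-dimensional projection matrix of shape (p, n)
print(S.shape, freqs)
```

Each row of `S` is invisible to `H` by construction, so `S @ x` depends only on the null-space component of a signal `x`.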
IV · Jan 29, 2025
Distilling Knowledge for Designing Computational Imaging Systems
Leon Suarez-Rodriguez, Roman Jacome, Henry Arguello
Designing the physical encoder is crucial for accurate image reconstruction in computational imaging (CI) systems. Currently, these systems are designed via end-to-end (E2E) optimization, where the encoder is modeled as a neural network layer and is jointly optimized with the decoder. However, the performance of E2E optimization is significantly reduced by the physical constraints imposed on the encoder. Also, since E2E optimization learns the parameters of the encoder by backpropagating the reconstruction error, it does not promote optimal intermediate outputs and suffers from vanishing gradients. To address these limitations, we reinterpret the concept of knowledge distillation (KD) for designing a physically constrained CI system by transferring the knowledge of a pretrained, less-constrained CI system. Our approach involves three steps: (1) Given the original CI system (student), a teacher system is created by relaxing the constraints on the student's encoder. (2) The teacher is optimized to solve a less-constrained version of the student's problem. (3) The teacher guides the training of the student through two proposed knowledge transfer functions, targeting both the encoder and the decoder feature space. The proposed method can be applied to any imaging modality since the relaxation scheme and the loss functions can be adapted according to the physical acquisition and the employed decoder. This approach was validated on three representative CI modalities: magnetic resonance, single-pixel, and compressive spectral imaging. Simulations show that a teacher system with an encoder that has a structure similar to that of the student encoder provides effective guidance. Our approach achieves significantly improved reconstruction performance and encoder design, outperforming both E2E optimization and traditional non-data-driven encoder designs.
CV · Oct 2, 2025
NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems
Roman Jacome, Romario Gualdrón-Hurtado, Leon Suarez et al.
Imaging inverse problems aim to recover high-dimensional signals from undersampled, noisy measurements, a fundamentally ill-posed task with infinitely many solutions that differ in the null-space of the sensing operator. To resolve this ambiguity, prior information is typically incorporated through handcrafted regularizers or learned models that constrain the solution space. However, these priors typically ignore the task-specific structure of that null-space. In this work, we propose Non-Linear Projections of the Null-Space (NPN), a novel class of regularization that, instead of enforcing structural constraints in the image domain, uses a neural network to promote solutions that lie in a low-dimensional projection of the sensing matrix's null-space. Our approach has two key advantages: (1) Interpretability: by focusing on the structure of the null-space, we design sensing-matrix-specific priors that capture information orthogonal to the signal components that are fundamentally blind to the sensing process. (2) Flexibility: NPN is adaptable to various inverse problems, compatible with existing reconstruction frameworks, and complementary to conventional image-domain priors. We provide theoretical guarantees on convergence and reconstruction accuracy when used within plug-and-play methods. Empirical results across diverse sensing matrices demonstrate that NPN priors consistently enhance reconstruction fidelity in various imaging inverse problems, such as compressive sensing, deblurring, super-resolution, computed tomography, and magnetic resonance imaging, with plug-and-play methods, unrolling networks, deep image prior, and diffusion models.
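The null-space ambiguity described above can be made concrete with a small linear example: any signal splits into a part the sensing matrix can see and a part it cannot. The sketch also builds a hypothetical low-dimensional projection `S` whose rows lie in the null-space; the random construction of `S` is an assumption for illustration, and the neural network that would predict `S @ x` from the measurements is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((3, 6))  # toy sensing matrix: 3 measurements of a 6-dim signal
x = rng.standard_normal(6)

# Split x into the component visible to H and the null-space component.
H_pinv = np.linalg.pinv(H)
x_range = H_pinv @ (H @ x)  # recoverable from the measurements
x_null = x - x_range        # invisible: H @ x_null is zero up to rounding

# Hypothetical projection with p = 2 rows, each forced into the null-space of H.
null_projector = np.eye(6) - H_pinv @ H
S = rng.standard_normal((2, 6)) @ null_projector

# S @ x carries information only about the null-space component of x.
null_features = S @ x
print(np.linalg.norm(H @ x_null), null_features)
```

Because `S @ x` equals `S @ x_null`, a regularizer built on these features constrains exactly the part of the signal that the data-fidelity term leaves undetermined.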
CV · Sep 18, 2025
DICE: Diffusion Consensus Equilibrium for Sparse-view CT Reconstruction
Leon Suarez-Rodriguez, Roman Jacome, Romario Gualdron-Hurtado et al.
Sparse-view computed tomography (CT) reconstruction is fundamentally challenging due to undersampling, leading to an ill-posed inverse problem. Traditional iterative methods incorporate handcrafted or learned priors to regularize the solution but struggle to capture the complex structures present in medical images. In contrast, diffusion models (DMs) have recently emerged as powerful generative priors that can accurately model complex image distributions. In this work, we introduce Diffusion Consensus Equilibrium (DICE), a framework that integrates a two-agent consensus equilibrium into the sampling process of a DM. DICE alternates between: (i) a data-consistency agent, implemented through a proximal operator enforcing measurement consistency, and (ii) a prior agent, realized by a DM performing a clean image estimation at each sampling step. By balancing these two complementary agents iteratively, DICE effectively combines strong generative prior capabilities with measurement consistency. Experimental results show that DICE significantly outperforms state-of-the-art baselines in reconstructing high-quality CT images under uniform and non-uniform sparse-view settings of 15, 30, and 60 views (out of a total of 180), demonstrating both its effectiveness and robustness.
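The alternation between a data-consistency agent and a prior agent can be sketched as below. The proximal operator of the least-squares data term has a closed form for a linear operator; the prior agent here is a simple moving-average smoother standing in for the diffusion model's clean-image estimate. The stand-in prior, the plain alternation, the random operator, and the parameter `rho` are all assumptions for illustration, not the DICE sampling procedure.

```python
import numpy as np

def prox_data(v, A, y, rho):
    """Proximal operator of 0.5 * ||A x - y||^2 at v:
    argmin_x 0.5 * ||A x - y||^2 + 0.5 * rho * ||x - v||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + rho * np.eye(n), A.T @ y + rho * v)

def toy_prior(x):
    """Stand-in prior agent (NOT a diffusion model): 3-tap moving average
    with edge replication."""
    padded = np.pad(x, 1, mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def alternate(A, y, rho=1.0, iters=50):
    """Alternate the prior agent and the data-consistency agent."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = prox_data(toy_prior(x), A, y, rho)
    return x

rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 1.0, 16)  # smooth toy signal
A = rng.standard_normal((8, 16))    # undersampled toy operator, not a CT projector
y = A @ x_true
x_hat = alternate(A, y)
print(np.linalg.norm(A @ x_hat - y))
```

The parameter `rho` controls how strongly the data agent trusts the prior agent's output: large values keep the result close to the prior estimate, small values enforce the measurements more tightly.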
CV · Jun 25, 2024
Highly Constrained Coded Aperture Imaging Systems Design Via a Knowledge Distillation Approach
Leon Suarez-Rodriguez, Roman Jacome, Henry Arguello
Computational optical imaging (COI) systems have enabled the acquisition of high-dimensional signals through optical coding elements (OCEs). OCEs encode the high-dimensional signal in one or more snapshots, which are subsequently decoded using computational algorithms. Currently, COI systems are optimized through an end-to-end (E2E) approach, where the OCEs are modeled as a layer of a neural network and the remaining layers perform a specific imaging task. However, the performance of COI systems optimized through E2E is limited by the physical constraints imposed by these systems. This paper proposes a knowledge distillation (KD) framework for the design of highly physically constrained COI systems. This approach employs the KD methodology, which consists of a teacher-student relationship, where a high-performance, unconstrained COI system (the teacher) guides the optimization of a physically constrained system (the student) characterized by a limited number of snapshots. We validate the proposed approach using a binary coded-aperture single-pixel camera for monochromatic and multispectral image reconstruction. Simulation results demonstrate the superiority of the KD scheme over traditional E2E optimization for designing highly physically constrained COI systems.