CVMay 24
Stop Denoising Your BlursSasidhar Parvathireddy, Vamsidhar Saraswathula, Rama Krishna Gorthi
In recent times, diffusion models have achieved remarkable performance in image restoration tasks. Their core mechanism relies on the restricted presumption of degradation prior to the additive noise operation. However, the blur model, one of the most widely studied degradation formulations, violates this assumption, as it is inherently based on convolution rather than addition. In this paper, we introduce ConvDiff, a novel diffusion based framework that substitutes the additive operation with convolution for the task of image deblurring. In the forward process, we construct a meaningful trajectory from the clean image to its blurred counterpart by exploiting the frequency domain characteristics of convolution, rather than progressively corrupting the image with additive noise. While the current work instantiates this framework for Gaussian blur, where frequency-domain decomposition yields closed-form and physically valid intermediate states, the underlying principle of constructing degradation trajectories from the blur operator extends naturally to other blur families. This formulation bridges the gap between the mathematical principles of blurring and the iterative design of diffusion-based restoration algorithms, enabling more physically grounded and effective image restoration models.
IVJul 6, 2024
Effective-LDAM: An Effective Loss Function To Mitigate Data Imbalance for Robust Chest X-Ray Disease ClassificationSree Rama Vamsidhar S, Bhargava Satya, Rama Krishna Gorthi
Deep Learning (DL) approaches have gained prominence in medical imaging for disease diagnosis. Chest X-ray (CXR) classification has emerged as an effective method for detecting various diseases. Among these methodologies, Chest X-ray (CXR) classification has proven to be an effective approach for detecting and analyzing various diseases. However, the reliable performance of DL classification algorithms is dependent upon access to large and balanced datasets, which pose challenges in medical imaging due to the impracticality of acquiring sufficient data for every disease category. To tackle this problem, we propose an algorithmic-centric approach called Effective-Label Distribution Aware Margin (E-LDAM), which modifies the margin of the widely adopted Label Distribution Aware Margin (LDAM) loss function using an effective number of samples in each class. Experimental evaluations on the COVIDx CXR dataset focus on Normal, Pneumonia, and COVID-19 classification. The experimental results demonstrate the effectiveness of the proposed E-LDAM approach, achieving a remarkable recall score of 97.81% for the minority class (COVID-19) in CXR image prediction. Furthermore, the overall accuracy of the three-class classification task attains an impressive level of 95.26%.
CVJul 5, 2024
D3: Deep Deconvolution Deblurring for Natural ImagesVamsidhar Saraswathula, Rama Krishna Gorthi
In this paper, we propose to reformulate the blind image deblurring task to directly learn an inverse of the degradation model represented by a deep linear network. We introduce Deep Identity Learning (DIL), a novel learning strategy that includes a dedicated regularization term based on the properties of linear systems, to exploit the identity relation between the degradation and inverse degradation models. The salient aspect of our proposed framework is it neither relies on a deblurring dataset nor a single input blurry image (e.g. Polyblur, a self-supervised method). This framework detours the typical degradation kernel estimation step involved in most of the existing blind deblurring solutions by the proposition of our Random Kernel Gallery (RKG) dataset. The proposed approach extends our previous Image Super-Resolution (ISR) work, NSSR-DIL, to the image deblurring task. In this work, we updated the regularization term in DIL based on Fourier transform properties of the identity relation, to deliver robust performance across a wide range of degradations. Besides the regularization term, we provide an explicit and compact representation of the learned deep linear network in a matrix form, called Deep Restoration Kernel (DRK) to perform image restoration. Our experiments show that the proposed method outperforms both traditional and deep learning based deblurring methods, with at least an order of 100 lesser computational resources. The D3 model, both LCNN & DRK, can be effortlessly extended to the Image Super-Resolution (ISR) task as well to restore the low-resolution images with fine details. The D3 model and its kernel form representation (DRK) are lightweight yet robust and restore the blurry input in a fraction of a second.
CVSep 17, 2024
NSSR-DIL: Null-Shot Image Super-Resolution Using Deep Identity LearningSree Rama Vamsidhar S, Rama Krishna Gorthi
The present State-of-the-Art (SotA) Image Super-Resolution (ISR) methods employ Deep Learning (DL) techniques using a large amount of image data. The primary limitation to extending the existing SotA ISR works for real-world instances is their computational and time complexities. In this paper, contrary to the existing methods, we present a novel and computationally efficient ISR algorithm that is independent of the image dataset to learn the ISR task. The proposed algorithm reformulates the ISR task from generating the Super-Resolved (SR) images to computing the inverse of the kernels that span the degradation space. We introduce Deep Identity Learning, exploiting the identity relation between the degradation and inverse degradation models. The proposed approach neither relies on the ISR dataset nor on a single input low-resolution (LR) image (like the self-supervised method i.e. ZSSR) to model the ISR task. Hence we term our model as Null-Shot Super-Resolution Using Deep Identity Learning (NSSR-DIL). The proposed NSSR-DIL model requires fewer computational resources, at least by an order of 10, and demonstrates a competitive performance on benchmark ISR datasets. Another salient aspect of our proposition is that the NSSR-DIL framework detours retraining the model and remains the same for varying scale factors like X2, X3, and X4. This makes our highly efficient ISR model more suitable for real-world applications.