IV · Mar 11, 2025 · Code
Deep Perceptual Enhancement for Medical Image Analysis
S M A Sharif, Rizwan Ali Naqvi, Mithun Biswas et al.
Owing to numerous hardware shortcomings, medical image acquisition devices are prone to producing low-quality images (e.g., low contrast, inappropriate brightness, noise). Such perceptually degraded images directly affect diagnosis and notably complicate decision-making for medical practitioners. This study proposes to enhance such low-quality images with an end-to-end learning strategy in order to accelerate medical image analysis tasks. To the best of our knowledge, this is the first work in medical imaging that comprehensively tackles perceptual enhancement, including contrast correction, luminance correction, and denoising, among others, with a fully convolutional deep network. The proposed network leverages residual blocks and a residual gating mechanism to diminish visual artefacts, and it is guided by a multi-term objective function to produce perceptually plausible enhanced images. The practicability of the method has been investigated through extensive experiments. The experimental outcomes illustrate that the proposed method can outperform existing enhancement methods across different medical image modalities by 5.00 to 7.00 dB in peak signal-to-noise ratio (PSNR) and by 4.00 to 6.00 in DeltaE. Additionally, the proposed method can drastically improve the performance of medical image analysis tasks, revealing the potential of such an enhancement method in real-world applications. Code available: https://github.com/sharif-apu/DPE_JBHI
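For readers unfamiliar with the PSNR figures quoted above, the following is a minimal sketch of the standard PSNR definition, assuming 8-bit pixel values stored in flat sequences. The function names `psnr` and `mse_ratio_for_gain` are illustrative and are not taken from the authors' repository.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)
```

Under this definition, a 5 dB gain corresponds to roughly a 3.2-fold reduction in mean squared error, and a 7 dB gain to roughly a 5-fold reduction.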
IV · Oct 16, 2021 · Code
SAGAN: Adversarial Spatial-asymmetric Attention for Noisy Nona-Bayer Reconstruction
S M A Sharif, Rizwan Ali Naqvi, Mithun Biswas
The Nona-Bayer colour filter array (CFA) pattern is considered one of the most viable alternatives to the traditional Bayer pattern. Despite their substantial advantages, such non-Bayer CFA patterns are prone to producing visual artefacts when RGB images are reconstructed from noisy sensor data. This study comprehensively addresses the challenges of learning RGB image reconstruction from a noisy Nona-Bayer CFA. We propose a novel spatial-asymmetric attention module that jointly learns a bi-directional transformation and large-kernel global attention to reduce visual artefacts. We combine the proposed module with adversarial learning to produce plausible images from the Nona-Bayer CFA. The feasibility of the proposed method has been verified and compared with the state-of-the-art image reconstruction method. The experiments reveal that the proposed method can reconstruct RGB images from a noisy Nona-Bayer CFA without producing visually disturbing artefacts. It also outperforms the state-of-the-art image reconstruction method in both qualitative and quantitative comparisons. Code available: https://github.com/sharif-apu/SAGAN_BMVC21.
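To make the Nona-Bayer layout concrete, here is a small sketch that samples an RGB image through such a pattern. It assumes an RGGB ordering in which each Bayer cell is enlarged to a 3x3 block of same-colour pixels; the helper names and the ordering are illustrative assumptions, not details taken from the paper or its code.

```python
# Assumed RGGB ordering of the underlying Bayer cell: 0 = R, 1 = G, 2 = B.
BAYER = ((0, 1), (1, 2))


def nona_bayer_channel(row, col):
    """Colour channel sampled at (row, col); each Bayer cell spans 3x3 pixels."""
    return BAYER[(row // 3) % 2][(col // 3) % 2]


def mosaic(rgb):
    """Keep one colour sample per pixel from a nested list of (R, G, B) tuples."""
    return [
        [pixel[nona_bayer_channel(r, c)] for c, pixel in enumerate(line)]
        for r, line in enumerate(rgb)
    ]
```

The pattern repeats every 6x6 pixels, so a reconstruction method has to interpolate missing colours across larger gaps than in a regular Bayer mosaic.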
CV · Mar 4, 2025
DarkDeblur: Learning single-shot image deblurring in low-light condition
S M A Sharif, Rizwan Ali Naqvi, Farman Ali et al.
Single-shot image deblurring in low-light conditions is known to be a profoundly challenging image translation task. This study tackles the limitations of low-light image deblurring with a learning-based approach and proposes a novel deep network named DarkDeblurNet. The proposed DarkDeblurNet comprises a dense-attention block and a contextual gating mechanism in a feature pyramid structure to leverage content awareness. The model additionally incorporates a multi-term objective function to achieve plausible perceptual image quality while deblurring in low-light settings. The practicability of the proposed model has been verified by integrating it into numerous computer vision applications. In addition, this study introduces a benchmark dataset collected with actual hardware to assess low-light image deblurring methods in a real-world setup. The experimental results illustrate that the proposed method can outperform the state-of-the-art methods on both synthesized and real-world data for single-shot image deblurring, even in challenging lighting environments.
IV · Mar 10, 2025
Two-stage Deep Denoising with Self-guided Noise Attention for Multimodal Medical Images
S M A Sharif, Rizwan Ali Naqvi, Woong-Kee Loh
Medical image denoising is considered among the most challenging vision tasks. Despite the real-world implications, existing denoising methods have notable drawbacks, as they often generate visual artifacts when applied to heterogeneous medical images. This study addresses the limitations of contemporary denoising methods with an artificial intelligence (AI)-driven two-stage learning strategy. The proposed method first learns to estimate the residual noise from the noisy images. It then incorporates a novel noise attention mechanism that correlates the estimated residual noise with the noisy inputs to perform denoising in a coarse-to-fine manner. This study also proposes a multi-modal learning strategy to generalize denoising across medical image modalities and multiple noise patterns for widespread applications. The practicability of the proposed method has been evaluated with extensive experiments. The experimental results demonstrate that the proposed method achieves state-of-the-art performance, significantly outperforming existing medical image denoising methods in quantitative and qualitative comparisons. Overall, it shows a performance gain of 7.64 in Peak Signal-to-Noise Ratio (PSNR), 0.1021 in Structural Similarity Index (SSIM), 0.80 in DeltaE ($\Delta E$), 0.1855 in Visual Information Fidelity Pixel-wise (VIFP), and 18.54 in Mean Squared Error (MSE) metrics.
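The two-stage idea (estimate the residual noise, then use it to guide the correction) can be illustrated on a 1-D signal. The sketch below is only a toy under strong assumptions: a moving average stands in for the learned noise estimator, and a normalised noise magnitude stands in for the learned noise attention. Neither is the authors' network, and all names are hypothetical.

```python
def moving_average(signal, k=3):
    """Edge-replicating moving average (stand-in for a learned estimator)."""
    half = k // 2
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1)]
        out.append(sum(window) / k)
    return out


def two_stage_denoise(noisy):
    # Stage 1 (stand-in): residual noise = input minus a smoothed version.
    smooth = moving_average(noisy)
    noise_estimate = [x - s for x, s in zip(noisy, smooth)]
    # Toy "attention" in [0, 1]: larger where the estimated noise is stronger.
    peak = max(abs(n) for n in noise_estimate) or 1.0
    attention = [abs(n) / peak for n in noise_estimate]
    # Stage 2 (stand-in): remove the estimated noise, weighted by attention.
    return [x - a * n for x, a, n in zip(noisy, attention, noise_estimate)]
```

In this toy version, a flat signal passes through unchanged, while an isolated spike is pulled strongly towards its neighbours.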
CV · Mar 10, 2025
Illuminating Darkness: Learning to Enhance Low-light Images In-the-Wild
S M A Sharif, Abdur Rehman, Zain Ul Abidin et al.
Single-shot low-light image enhancement (SLLIE) remains challenging due to the limited availability of diverse, real-world paired datasets. To bridge this gap, we introduce the Low-Light Smartphone Dataset (LSD), a large-scale, high-resolution (4K+) dataset collected in the wild across a wide range of challenging lighting conditions (0.1 to 200 lux). LSD contains 6,425 precisely aligned low- and normal-light image pairs, selected from over 8,000 dynamic indoor and outdoor scenes through multi-frame acquisition and expert evaluation. To evaluate generalization and aesthetic quality, we collect 2,117 unpaired low-light images from previously unseen devices. To fully exploit LSD, we propose TFFormer, a hybrid model that encodes luminance and chrominance (LC) separately to reduce color-structure entanglement. We further propose a cross-attention-driven joint decoder for context-aware fusion of LC representations, along with LC refinement and LC-guided supervision to significantly enhance perceptual fidelity and structural consistency. TFFormer achieves state-of-the-art results on LSD (+2.45 dB PSNR) and substantially improves downstream vision tasks, such as low-light object detection (+6.80 mAP on ExDark).
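As an illustration of what separating luminance from chrominance means, the sketch below converts a single RGB pixel to a luma/chroma representation and back. It assumes the BT.601 YCbCr definition with values in [0, 1]; the abstract does not state which colour space TFFormer uses, so this choice and the function names are assumptions.

```python
def rgb_to_ycbcr(r, g, b):
    """Split an RGB pixel into luminance (Y) and chrominance (Cb, Cr), BT.601."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 1.772
    cr = (r - y) / 1.402
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr."""
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

A grey pixel has zero chrominance in this representation, so brightness changes can be handled on Y without touching colour.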
CV · Sep 25, 2025
Punching Above Precision: Small Quantized Model Distillation with Learnable Regularizer
Abdur Rehman, S M A Sharif, Md Abdur Rahaman et al.
Quantization-aware training (QAT) combined with knowledge distillation (KD) is a promising strategy for compressing Artificial Intelligence (AI) models for deployment on resource-constrained hardware. However, existing QAT-KD methods often struggle to balance task-specific (TS) and distillation losses due to heterogeneous gradient magnitudes, especially under low-bit quantization. We propose Game of Regularizer (GoR), a novel learnable regularization method that adaptively balances TS and KD objectives using only two trainable parameters for dynamic loss weighting. GoR reduces conflict between supervision signals, improves convergence, and boosts the performance of small quantized models (SQMs). Experiments on image classification, object detection (OD), and large language model (LLM) compression show that GoR consistently outperforms state-of-the-art QAT-KD methods. On low-power edge devices, it delivers faster inference while maintaining full-precision accuracy. We also introduce QAT-EKD-GoR, an ensemble distillation framework that uses multiple heterogeneous teacher models. Under optimal conditions, the proposed EKD-GoR can outperform full-precision models, providing a robust solution for real-world deployment.
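The abstract does not give GoR's formulation. As a rough illustration of balancing two losses with only two trainable parameters, the sketch below uses a different, well-known scheme: a simplified form of uncertainty-based loss weighting, where each loss is scaled by exp(-s) and s is penalised linearly. The parameter names `s_task` and `s_kd` and the plain gradient-descent update are assumptions made for this example.

```python
import math


def weighted_loss(task_loss, kd_loss, s_task, s_kd):
    """Combine two losses using two learnable log-scale parameters."""
    return (math.exp(-s_task) * task_loss + s_task
            + math.exp(-s_kd) * kd_loss + s_kd)


def update_weights(task_loss, kd_loss, s_task, s_kd, lr=0.1):
    """One gradient-descent step on the two weighting parameters only."""
    # d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
    grad_task = 1.0 - math.exp(-s_task) * task_loss
    grad_kd = 1.0 - math.exp(-s_kd) * kd_loss
    return s_task - lr * grad_task, s_kd - lr * grad_kd
```

With the loss values held fixed, each parameter settles at the logarithm of its loss, so the two scaled terms contribute equally even when their raw magnitudes differ.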
CV · Sep 22, 2025
Degradation-Aware All-in-One Image Restoration via Latent Prior Encoding
S M A Sharif, Abdur Rehman, Fayaz Ali Dharejo et al.
Real-world images often suffer from spatially diverse degradations such as haze, rain, snow, and low-light, significantly impacting visual quality and downstream vision tasks. Existing all-in-one restoration (AIR) approaches either depend on external text prompts or embed hand-crafted architectural priors (e.g., frequency heuristics); both impose discrete, brittle assumptions that weaken generalization to unseen or mixed degradations. To address this limitation, we propose to reframe AIR as learned latent prior inference, where degradation-aware representations are automatically inferred from the input without explicit task cues. Based on latent priors, we formulate AIR as a structured reasoning paradigm: (1) which features to route (adaptive feature selection), (2) where to restore (spatial localization), and (3) what to restore (degradation semantics). We design a lightweight decoding module that efficiently leverages these latent encoded cues for spatially-adaptive restoration. Extensive experiments across six common degradation tasks, five compound settings, and previously unseen degradations demonstrate that our method outperforms state-of-the-art (SOTA) approaches, achieving an average PSNR improvement of 1.68 dB while being three times more efficient.