IVJun 21, 2024Code
PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology ImagesParastoo Sotoudeh Sharifi, M. Omair Ahmad, M. N. S. Swamy
Recent advancements in deep learning (DL) have significantly advanced medical image analysis. In the field of medical image processing, particularly in histopathology image analysis, the variation in staining protocols and differences in scanners present significant domain shift challenges, undermine the generalization capabilities of models to the data from unseen domains, prompting the need for effective domain generalization (DG) strategies to improve the consistency and reliability of automated cancer detection tools in diagnostic decision-making. In this paper, we introduce Pathology Weight Averaging (PathoWAve), a multi-source DG strategy for addressing domain shift phenomenon of DL models in histopathology image analysis. Integrating specific weight averaging technique with parallel training trajectories and a strategically combination of regular augmentations with histopathology-specific data augmentation methods, PathoWAve enables a comprehensive exploration and precise convergence within the loss landscape. This method significantly enhanced generalization capabilities of DL models across new, unseen histopathology domains. To the best of our knowledge, PathoWAve is the first proposed weight averaging method for DG in histopathology image analysis. Our quantitative results on Camelyon17 WILDS dataset demonstrate PathoWAve's superiority over previous proposed methods to tackle the domain shift phenomenon in histopathology image processing. Our code is available at \url{https://github.com/ParastooSotoudeh/PathoWAve}.
LGMar 3, 2025
ECG-EmotionNet: Nested Mixture of Expert (NMoE) Adaptation of ECG-Foundation Model for Driver Emotion RecognitionNastaran Mansourian, Arash Mohammadi, M. Omair Ahmad et al.
Driver emotion recognition plays a crucial role in driver monitoring systems, enhancing human-autonomy interactions and the trustworthiness of Autonomous Driving (AD). Various physiological and behavioural modalities have been explored for this purpose, with Electrocardiogram (ECG) emerging as a standout choice for real-time emotion monitoring, particularly in dynamic and unpredictable driving conditions. Existing methods, however, often rely on multi-channel ECG signals recorded under static conditions, limiting their applicability in real-world dynamic driving scenarios. To address this limitation, the paper introduces ECG-EmotionNet, a novel architecture designed specifically for emotion recognition in dynamic driving environments. ECG-EmotionNet is constructed by adapting a recently introduced ECG Foundation Model (FM) and uniquely employs single-channel ECG signals, ensuring both robust generalizability and computational efficiency. Unlike conventional adaptation methods such as full fine-tuning, linear probing, or low-rank adaptation, we propose an intuitively pleasing alternative, referred to as the nested Mixture of Experts (MoE) adaptation. More precisely, each transformer layer of the underlying FM is treated as a separate expert, with embeddings extracted from these experts fused using trainable weights within a gating mechanism. This approach enhances the representation of both global and local ECG features, leading to a 6% improvement in accuracy and a 7% increase in the F1 score, all while maintaining computational efficiency. The effectiveness of the proposed ECG-EmotionNet architecture is evaluated using a recently introduced and challenging driver emotion monitoring dataset.
IVJul 4, 2020
Multi-Site Infant Brain Segmentation Algorithms: The iSeg-2019 ChallengeYue Sun, Kun Gao, Zhengwang Wu et al.
To better understand early brain growth patterns in health and disorder, it is critical to accurately segment infant brain magnetic resonance (MR) images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). Deep learning-based methods have achieved state-of-the-art performance; however, one of major limitations is that the learning-based methods may suffer from the multi-site issue, that is, the models trained on a dataset from one site may not be applicable to the datasets acquired from other sites with different imaging protocols/scanners. To promote methodological development in the community, iSeg-2019 challenge (http://iseg2019.web.unc.edu) provides a set of 6-month infant subjects from multiple sites with different protocols/scanners for the participating methods. Training/validation subjects are from UNC (MAP) and testing subjects are from UNC/UMN (BCP), Stanford University, and Emory University. By the time of writing, there are 30 automatic segmentation methods participating in iSeg-2019. We review the 8 top-ranked teams by detailing their pipelines/implementations, presenting experimental results and evaluating performance in terms of the whole brain, regions of interest, and gyral landmark curves. We also discuss their limitations and possible future directions for the multi-site issue. We hope that the multi-site dataset in iSeg-2019 and this review article will attract more researchers on the multi-site issue.
CRJun 4, 2020
Design and Hardware Implementation of a Separable Image Steganographic Scheme Using Public-key CryptosystemSalah Harb, M. Omair Ahmad, M. N. S Swamy
In this paper, a novel and efficient hardware implementation of steganographic cryptosystem based on a public-key cryptography is proposed. Digital images are utilized as carriers of secret data between sender and receiver parties in the communication channel. The proposed public-key cryptosystem offers a separable framework that allows to embed or extract secret data and encrypt or decrypt the carrier using the public-private key pair, independently. Paillier cryptographic system is adopted to encrypt and decrypt pixels of the digital image. To achieve efficiency, a proposed efficient parallel montgomery exponentiation core is designed and implemented for performing the underlying field operations in the Paillier cryptosystem. The hardware implementation results of the proposed steganographic cryptosystem show an efficiency in terms of area (resources), performance (speed) and power consumption. Our steganographic cryptosystem represents a small footprint making it well-suited for the embedded systems and real-time processing engines in applications such as medical scanning devices, autopilot cars and drones.
CVMar 18, 2020
Applying r-spatiogram in object tracking for occlusion handlingNiloufar Salehi Dastjerdi, M. Omair Ahmad
Object tracking is one of the most important problems in computer vision. The aim of video tracking is to extract the trajectories of a target or object of interest, i.e. accurately locate a moving target in a video sequence and discriminate target from non-targets in the feature space of the sequence. So, feature descriptors can have significant effects on such discrimination. In this paper, we use the basic idea of many trackers which consists of three main components of the reference model, i.e., object modeling, object detection and localization, and model updating. However, there are major improvements in our system. Our forth component, occlusion handling, utilizes the r-spatiogram to detect the best target candidate. While spatiogram contains some moments upon the coordinates of the pixels, r-spatiogram computes region-based compactness on the distribution of the given feature in the image that captures richer features to represent the objects. The proposed research develops an efficient and robust way to keep tracking the object throughout video sequences in the presence of significant appearance variations and severe occlusions. The proposed method is evaluated on the Princeton RGBD tracking dataset considering sequences with different challenges and the obtained results demonstrate the effectiveness of the proposed method.
ASFeb 19, 2018
Speech Enhancement in Adverse Environments Based on Non-stationary Noise-driven Spectral Subtraction and SNR-dependent Phase CompensationMd Tauhidul Islam, Asaduzzaman, Celia Shahnaz et al.
A two-step enhancement method based on spectral subtraction and phase spectrum compensation is presented in this paper for noisy speeches in adverse environments involving non-stationary noise and medium to low levels of SNR. The magnitude of the noisy speech spectrum is modified in the first step of the proposed method by a spectral subtraction approach, where a new noise estimation method based on the low frequency information of the noisy speech is introduced. We argue that this method of noise estimation is capable of estimating the non-stationary noise accurately. The phase spectrum of the noisy speech is modified in the second step consisting of phase spectrum compensation, where an SNR-dependent approach is incorporated to determine the amount of compensation to be imposed on the phase spectrum. A modified complex spectrum is obtained by aggregating the magnitude from the spectral subtraction step and modified phase spectrum from the phase compensation step, which is found to be a better representation of enhanced speech spectrum. Speech files available in the NOIZEUS database are used to carry extensive simulations for evaluation of the proposed method.
ASFeb 13, 2018
Enhancement of Noisy Speech with Low Speech Distortion Based on Probabilistic Geometric Spectral SubtractionMd Tauhidul Islam, Celia Shahnaz, Wei-Ping Zhu et al.
A speech enhancement method based on probabilistic geometric approach to spectral subtraction (PGA) performed on short time magnitude spectrum is presented in this paper. A confidence parameter of noise estimation is introduced in the gain function of the proposed method to prevent subtraction of the overestimated and underestimated noise, which not only removes the noise efficiently but also prevents the speech distortion. The noise compensated magnitude spectrum is then recombined with the unchanged phase spectrum to produce a modified complex spectrum prior to synthesize an enhanced frame. Extensive simulations are carried out using the speech files available in the NOIZEUS database in order to evaluate the performance of the proposed method.
ASFeb 9, 2018
Modeling of Teager Energy Operated Perceptual Wavelet Packet Coefficients with an Erlang-2 PDF for Real Time Enhancement of Noisy SpeechMd Tauhidul Islam, Celia Shahnaz, Wei-Ping Zhu et al.
In this paper, for real time enhancement of noisy speech, a method of threshold determination based on modeling of Teager energy (TE) operated perceptual wavelet packet (PWP) coefficients of the noisy speech and noise by an Erlang-2 PDF is presented. The proposed method is computationally much faster than the existing wavelet packet based thresholding methods. A custom thresholding function based on a combination of mu-law and semisoft thresholding functions is designed and exploited to apply the statistically derived threshold upon the PWP coefficients. The proposed custom thresholding function works as a mu-law or a semisoft thresholding function or their combination based on the probability of speech presence and absence in a subband of the PWP transformed noisy speech. By using the speech files available in NOIZEUS database, a number of simulations are performed to evaluate the performance of the proposed method for speech signals in the presence of Gaussian white and street noises. The proposed method outperforms some of the state-of-the-art speech enhancement methods both at high and low levels of SNRs in terms of standard objective measures and subjective evaluations including formal listening tests.
ASFeb 7, 2018
A Divide and Conquer Strategy for Musical Noise-free Speech Enhancement in Adverse EnvironmentsMd Tauhidul Islam, Celia Shahnaz, Wei-Ping Zhu et al.
A divide and conquer strategy for enhancement of noisy speeches in adverse environments involving lower levels of SNR is presented in this paper, where the total system of speech enhancement is divided into two separate steps. The first step is based on noise compensation on short time magnitude and the second step is based on phase compensation. The magnitude spectrum is compensated based on a modified spectral subtraction method where the cross-terms containing spectra of noise and clean speech are taken into consideration, which are neglected in the traditional spectral subtraction methods. By employing the modified magnitude and unchanged phase, a procedure is formulated to compensate the overestimation or underestimation of noise by phase compensation method based on the probability of speech presence. A modified complex spectrum based on these two steps are obtained to synthesize a musical noise free enhanced speech. Extensive simulations are carried out using the speech files available in the NOIZEUS database in order to evaluate the performance of the proposed method. It is shown in terms of the objective measures, spectrogram analysis and formal subjective listening tests that the proposed method consistently outperforms some of the state-of-the-art methods of speech enhancement for noisy speech corrupted by street or babble noise at very low as well as medium levels of SNR.