Karim Armanious

CV
18 papers
1,098 citations
Novelty 44%
AI Score 29

18 Papers

CV · Jun 24, 2024
Bosch Street Dataset: A Multi-Modal Dataset with Imaging Radar for Automated Driving

Karim Armanious, Maurice Quach, Michael Ulrich et al.

This paper introduces the Bosch street dataset (BSD), a novel multi-modal large-scale dataset aimed at promoting highly automated driving (HAD) and advanced driver-assistance systems (ADAS) research. Unlike existing datasets, BSD offers a unique integration of high-resolution imaging radar, lidar, and camera sensors, providing unprecedented 360-degree coverage to bridge the current gap in high-resolution radar data availability. Spanning urban, rural, and highway environments, BSD enables detailed exploration into radar-based object detection and sensor fusion techniques. The dataset is intended to facilitate academic and research collaborations between Bosch and its current and future partners, fostering joint efforts in developing cutting-edge HAD and ADAS technologies. The paper describes the dataset's key attributes, including its scalability, radar resolution, and labeling methodology. Key offerings also include initial benchmarks for sensor modalities and a development kit tailored for extensive data analysis and performance evaluation, underscoring our commitment to contributing valuable resources to the HAD and ADAS research community.

CV · May 16, 2023
Wavelet-based Unsupervised Label-to-Image Translation

George Eskandar, Mohamed Abdelsamad, Karim Armanious et al.

Semantic Image Synthesis (SIS) is a subclass of image-to-image translation where a semantic layout is used to generate a photorealistic image. State-of-the-art conditional Generative Adversarial Networks (GANs) need a huge amount of paired data to accomplish this task while generic unpaired image-to-image translation frameworks underperform in comparison, because they color-code semantic layouts and learn correspondences in appearance instead of semantic content. Starting from the assumption that a high quality generated image should be segmented back to its semantic layout, we propose a new Unsupervised paradigm for SIS (USIS) that makes use of a self-supervised segmentation loss and whole image wavelet based discrimination. Furthermore, in order to match the high-frequency distribution of real images, a novel generator architecture in the wavelet domain is proposed. We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
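To make the idea of working "in the wavelet domain" concrete, here is a minimal sketch of one level of a 2D Haar decomposition, which splits an image into a low-frequency approximation and three high-frequency detail bands. The abstract does not say which wavelet family the paper uses, so the Haar choice, the function name `haar_dwt2`, and the band names are assumptions made purely for illustration.

```python
def haar_dwt2(image):
    """One level of the orthonormal 2D Haar transform.

    `image` is a list of equal-length rows with even height and width.
    Returns a dict of four half-resolution bands (illustrative names).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    bands = {"approx": [], "col_diff": [], "row_diff": [], "diag": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            # The 2x2 block being transformed:
            #   a b
            #   d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            out["approx"].append((a + b + d + e) / 2.0)    # local average
            out["col_diff"].append((a - b + d - e) / 2.0)  # change across columns
            out["row_diff"].append((a + b - d - e) / 2.0)  # change across rows
            out["diag"].append((a - b - d + e) / 2.0)      # diagonal change
        for name in bands:
            bands[name].append(out[name])
    return bands


flat = [[1.0] * 4 for _ in range(4)]
stripes = [[0.0, 1.0], [0.0, 1.0]]
flat_bands = haar_dwt2(flat)
stripe_bands = haar_dwt2(stripes)
```

A flat image puts all of its energy in the approximation band, while vertical stripes show up only in the column-difference band. That separation is what lets a wavelet-based discriminator or generator treat high-frequency content explicitly.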

CV · Sep 29, 2021
USIS: Unsupervised Semantic Image Synthesis

George Eskandar, Mohamed Abdelsamad, Karim Armanious et al.

Semantic Image Synthesis (SIS) is a subclass of image-to-image translation where a photorealistic image is synthesized from a segmentation mask. SIS has mostly been addressed as a supervised problem. However, state-of-the-art methods depend on a huge amount of labeled data and cannot be applied in an unpaired setting. On the other hand, generic unpaired image-to-image translation frameworks underperform in comparison, because they color-code semantic layouts and feed them to traditional convolutional networks, which then learn correspondences in appearance instead of semantic content. In this initial work, we propose a new Unsupervised paradigm for Semantic Image Synthesis (USIS) as a first step towards closing the performance gap between paired and unpaired settings. Notably, the framework deploys a SPADE generator that learns to output images with visually separable semantic classes using a self-supervised segmentation loss. Furthermore, in order to match the color and texture distribution of real images without losing high-frequency information, we propose to use whole image wavelet-based discrimination. We test our methodology on 3 challenging datasets and demonstrate its ability to generate multimodal photorealistic images with an improved quality in the unpaired setting.

CV · Mar 15, 2021
Uncertainty-Based Biological Age Estimation of Brain MRI Scans

Karim Armanious, Sherif Abdulatif, Wenbin Shi et al.

Age is an essential factor in modern diagnostic procedures. However, assessment of the true biological age (BA) remains a daunting task due to the lack of reference ground-truth labels. Current BA estimation approaches are either restricted to skeletal images or rely on non-imaging modalities that yield a whole-body BA assessment. However, various organ systems may exhibit different aging characteristics due to lifestyle and genetic factors. In this initial study, we propose a new framework for organ-specific BA estimation utilizing 3D magnetic resonance image (MRI) scans. As a first step, this framework predicts the chronological age (CA) together with the corresponding patient-dependent aleatoric uncertainty. An iterative training algorithm is then utilized to segregate atypical aging patients from the given population based on the predicted uncertainty scores. In this manner, we hypothesize that training a new model on the remaining population should approximate the true BA behavior. We apply the proposed methodology on a brain MRI dataset containing healthy individuals as well as Alzheimer's patients. We demonstrate the correlation between the predicted BAs and the expected cognitive deterioration in Alzheimer's patients.
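One common way to predict an age together with a per-subject aleatoric uncertainty is to have the network output a mean and a log-variance and to train it with a Gaussian negative log-likelihood. The sketch below shows that loss and a simple uncertainty-based split of a population. The abstract does not state the loss or the selection rule used in the paper, so `gaussian_nll`, `split_by_uncertainty`, and the variance threshold are hypothetical illustrations, not the authors' method.

```python
import math


def gaussian_nll(age_true, age_pred, log_var):
    """Per-sample Gaussian negative log-likelihood (constant term dropped).

    A large predicted variance down-weights the squared error but is
    penalised by the 0.5 * log_var term, so the model cannot simply
    claim to be uncertain about everything.
    """
    return 0.5 * math.exp(-log_var) * (age_true - age_pred) ** 2 + 0.5 * log_var


def split_by_uncertainty(subjects, max_variance):
    """Split (subject_id, log_var) pairs at a hypothetical variance threshold."""
    typical = [sid for sid, lv in subjects if math.exp(lv) <= max_variance]
    atypical = [sid for sid, lv in subjects if math.exp(lv) > max_variance]
    return typical, atypical


# Same 2-year error, scored under unit variance and under variance 4.
loss_confident = gaussian_nll(70.0, 68.0, 0.0)
loss_uncertain = gaussian_nll(70.0, 68.0, math.log(4.0))

typical, atypical = split_by_uncertainty(
    [("s1", 0.0), ("s2", math.log(9.0)), ("s3", math.log(2.0))],
    max_variance=4.0,
)
```

In an iterative scheme of the kind the abstract describes, a split like this would be repeated: retrain on the `typical` group, re-estimate the uncertainties, and split again.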

CV · Feb 19, 2021
SLPC: a VRNN-based approach for stochastic lidar prediction and completion in autonomous driving

George Eskandar, Alexander Braun, Martin Meinke et al.

Predicting future 3D LiDAR point clouds is a challenging task that is useful in many autonomous driving applications, such as trajectory prediction, pose forecasting and decision making. In this work, we propose a new LiDAR prediction framework based on generative models, namely Variational Recurrent Neural Networks (VRNNs), titled Stochastic LiDAR Prediction and Completion (SLPC). Our algorithm is able to address the limitations of previous video prediction frameworks when dealing with sparse data by spatially inpainting the depth maps in the upcoming frames. Our contributions can thus be summarized as follows: we introduce the new task of predicting and completing depth maps from spatially sparse data, and we present a sparse version of VRNNs together with an effective self-supervised training method that does not require any labels. Experimental results illustrate the effectiveness of our framework in comparison to state-of-the-art methods in video prediction.

SD · Oct 20, 2020
Investigating Cross-Domain Losses for Speech Enhancement

Sherif Abdulatif, Karim Armanious, Jayasankar T. Sajeev et al.

Recent years have seen a surge in the number of available frameworks for speech enhancement (SE) and recognition. Whether model-based or constructed via deep learning, these frameworks often rely in isolation on either time-domain signals or time-frequency (TF) representations of speech data. In this study, we investigate the advantages of each set of approaches by separately examining their impact on speech intelligibility and quality. Furthermore, we combine the fragmented benefits of time-domain and TF speech representations by introducing two new cross-domain SE frameworks. A quantitative comparative analysis against recent model-based and deep learning SE approaches is performed to illustrate the merit of the proposed frameworks.

IV · Sep 22, 2020
Age-Net: An MRI-Based Iterative Framework for Brain Biological Age Estimation

Karim Armanious, Sherif Abdulatif, Wenbin Shi et al.

The concept of biological age (BA), although important in clinical practice, is hard to grasp mainly due to the lack of a clearly defined reference standard. For specific applications, especially in pediatrics, medical image data are used for BA estimation in a routine clinical context. Beyond this young age group, BA estimation is mostly restricted to whole-body assessment using non-imaging indicators such as blood biomarkers, genetic and cellular data. However, various organ systems may exhibit different aging characteristics due to lifestyle and genetic factors. Thus, a whole-body assessment of the BA does not reflect the deviations of aging behavior between organs. To this end, we propose a new imaging-based framework for organ-specific BA estimation. In this initial study, we focus mainly on brain MRI. As a first step, we introduce a chronological age (CA) estimation framework using deep convolutional neural networks (Age-Net). We quantitatively assess the performance of this framework in comparison to existing state-of-the-art CA estimation approaches. Furthermore, we expand upon Age-Net with a novel iterative data-cleaning algorithm to segregate atypical-aging patients (BA $\not \approx$ CA) from the given population. We hypothesize that the remaining population should approximate the true BA behavior. We apply the proposed methodology on a brain magnetic resonance image (MRI) dataset containing healthy individuals as well as Alzheimer's patients with different dementia ratings. We demonstrate the correlation between the predicted BAs and the expected cognitive deterioration in Alzheimer's patients. A statistical and visualization-based analysis has provided evidence regarding the potential and current challenges of the proposed methodology.

AS · Oct 21, 2019
AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks

Sherif Abdulatif, Karim Armanious, Karim Guirguis et al.

Automatic speech recognition (ASR) systems are of vital importance nowadays in commonplace tasks such as speech-to-text processing and language translation. This has created the need for an ASR system that can operate in realistic crowded environments. Thus, speech enhancement is a valuable building block in ASR systems and other applications such as hearing aids, smartphones and teleconferencing systems. In this paper, a generative adversarial network (GAN) based framework is investigated for the task of speech enhancement, more specifically speech denoising of audio tracks. A new architecture based on the CasNet generator and an additional feature-based loss are incorporated to get realistically denoised speech phonetics. Finally, the proposed framework is shown to outperform other learning and traditional model-based speech enhancement approaches.

IV · Oct 21, 2019
ipA-MedGAN: Inpainting of Arbitrary Regions in Medical Imaging

Karim Armanious, Vijeth Kumar, Sherif Abdulatif et al.

Local deformations in medical modalities are common phenomena due to a multitude of factors such as metallic implants or limited fields of view in magnetic resonance imaging (MRI). Completion of the missing or distorted regions is of special interest for automatic image analysis frameworks to enhance post-processing tasks such as segmentation or classification. In this work, we propose a new generative framework for medical image inpainting, titled ipA-MedGAN. It bypasses the limitations of previous frameworks by enabling inpainting of arbitrarily shaped regions without a prior localization of the regions of interest. Thorough qualitative and quantitative comparisons with other inpainting and translational approaches have illustrated the superior performance of the proposed framework for the task of brain MR inpainting.

IV · Oct 14, 2019
Organ-based Chronological Age Estimation based on 3D MRI Scans

Karim Armanious, Sherif Abdulatif, Anish Rao Bhaktharaguttu et al.

Individuals age differently depending on a multitude of different factors such as lifestyle, medical history and genetics. Often, the global chronological age is not indicative of the true ageing process. An organ-based age estimation would yield a more accurate health state assessment. In this work, we propose a new deep learning architecture for organ-based age estimation based on magnetic resonance images (MRI). The proposed network is a 3D convolutional neural network (CNN) with increased depth and width made possible by the hybrid utilization of inception and fire modules. We apply the proposed framework for the tasks of brain and knee age estimation. Quantitative comparisons against concurrent MR-based regression networks and different 2D and 3D data feeding strategies illustrated the superior performance of the proposed work.

IV · Oct 12, 2019
Unsupervised Adversarial Correction of Rigid MR Motion Artifacts

Karim Armanious, Aastha Tanwar, Sherif Abdulatif et al.

Motion is one of the main sources of artifacts in magnetic resonance (MR) images. It can have significant consequences on the diagnostic quality of the resultant scans. Previously, supervised adversarial approaches have been suggested for the correction of MR motion artifacts. However, these approaches are limited by their need for paired co-registered training datasets, which are often hard or impossible to acquire. Building upon our previous work, we introduce a new adversarial framework with a new generator architecture and loss function for the unsupervised correction of severe rigid motion artifacts in the brain region. Quantitative and qualitative comparisons with other supervised and unsupervised translation approaches showcase the enhanced performance of the introduced framework.

CV · Mar 8, 2019
Unsupervised Medical Image Translation Using Cycle-MedGAN

Karim Armanious, Chenming Jiang, Sherif Abdulatif et al.

Image-to-image translation is a new field in computer vision with multiple potential applications in the medical domain. However, for supervised image translation frameworks, co-registered datasets, paired in a pixel-wise sense, are required. This is often difficult to acquire in realistic medical scenarios. On the other hand, unsupervised translation frameworks often result in blurred translated images with unrealistic details. In this work, we propose a new unsupervised translation framework which is titled Cycle-MedGAN. The proposed framework utilizes new non-adversarial cycle losses which direct the framework to minimize the textural and perceptual discrepancies in the translated images. Qualitative and quantitative comparisons against other unsupervised translation approaches demonstrate the performance of the proposed framework for PET-CT translation and MR motion correction.

CV · Mar 4, 2019
An Adversarial Super-Resolution Remedy for Radar Design Trade-offs

Karim Armanious, Sherif Abdulatif, Fady Aziz et al.

Radar is of vital importance in many fields, such as autonomous driving, safety and surveillance applications. However, it suffers from stringent constraints on its design parametrization leading to multiple trade-offs. For example, the bandwidth in FMCW radars is inversely proportional to both the maximum unambiguous range and the range resolution. In this work, we introduce a new method for circumventing radar design trade-offs. We propose the use of recent advances in computer vision, more specifically generative adversarial networks (GANs), to enhance low-resolution radar acquisitions into higher resolution counterparts while maintaining the advantages of the low-resolution parametrization. The capability of the proposed method was evaluated on the velocity resolution and range-azimuth trade-offs in micro-Doppler signatures and FMCW uniform linear array (ULA) radars, respectively.
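The bandwidth trade-off mentioned in the abstract can be illustrated with the textbook FMCW relation for range resolution, ΔR = c / (2B). The sketch below also assumes, purely for illustration, that the number of usable range bins per chirp is fixed, so that the maximum unambiguous range is that bin count times ΔR. The function names and the `num_range_bins` parameter are hypothetical and do not come from the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_resolution(bandwidth_hz):
    """Smallest resolvable range difference, in metres, for sweep bandwidth B."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)


def max_unambiguous_range(bandwidth_hz, num_range_bins):
    """Maximum range in metres, assuming a fixed number of usable range bins."""
    return num_range_bins * range_resolution(bandwidth_hz)


narrow = 500e6  # 500 MHz sweep
wide = 1e9      # 1 GHz sweep
bins = 256      # hypothetical fixed bin count

res_narrow, res_wide = range_resolution(narrow), range_resolution(wide)
rmax_narrow = max_unambiguous_range(narrow, bins)
rmax_wide = max_unambiguous_range(wide, bins)
```

Doubling the bandwidth halves the range cell (finer resolution) but, under the fixed-bin assumption, also halves the maximum range. This is the kind of coupled design choice the paper proposes to sidestep with learned super-resolution.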

CV · Nov 17, 2018
Person Identification and Body Mass Index: A Deep Learning-Based Study on Micro-Dopplers

Sherif Abdulatif, Fady Aziz, Karim Armanious et al.

Smart surveillance requires a sensing system that can capture accurate and detailed information about the human walking style. Radar micro-Doppler ($\boldsymbol{\mu}$-D) analysis has proven to be a reliable metric for studying human locomotion. Thus, $\boldsymbol{\mu}$-D signatures can be used to identify humans based on their walking styles. Additionally, the signatures contain information about the radar cross section (RCS) of the moving subject. This paper investigates the effect of human body characteristics on human identification based on their $\boldsymbol{\mu}$-D signatures. In our proposed experimental setup, a treadmill is used to collect $\boldsymbol{\mu}$-D signatures of 22 subjects with different genders and body characteristics. Convolutional autoencoders (CAE) are then used to extract the latent space representation from the $\boldsymbol{\mu}$-D signatures. It is then interpreted in two dimensions using t-distributed stochastic neighbor embedding (t-SNE). Our study shows that the body mass index (BMI) has a correlation with the $\boldsymbol{\mu}$-D signature of the walking subject. A 50-layer deep residual network is then trained to identify the walking subject based on the $\boldsymbol{\mu}$-D signature. We achieve an accuracy of 98% on the test set with high signal-to-noise ratio (SNR) and 84% in the case of different SNR levels.

CV · Nov 12, 2018
Towards Adversarial Denoising of Radar Micro-Doppler Signatures

Sherif Abdulatif, Karim Armanious, Fady Aziz et al.

Generative Adversarial Networks (GANs) are considered the state-of-the-art in the field of image generation. They learn the joint distribution of the training data and attempt to generate new data samples in high dimensional space following the same distribution as the input. Recent improvements in GANs opened the field to many other computer vision applications based on improving and changing the characteristics of the input image to follow some given training requirements. In this paper, we propose a novel technique for the denoising and reconstruction of the micro-Doppler ($\boldsymbol{\mu}$-D) spectra of walking humans based on GANs. Two sets of experiments were collected on 22 subjects walking on a treadmill at an intermediate velocity using a 25 GHz CW radar. In one set, a clean $\boldsymbol{\mu}$-D spectrum is collected for each subject by placing the radar at a close distance to the subject. In the other set, variations are introduced in the experiment setup to introduce different noise and clutter effects on the spectrum by changing the distance and placing reflective objects between the radar and the target. Synthetic paired noisy and noise-free spectra were used for training, while validation was carried out on the real noisy measured data. Finally, qualitative and quantitative comparison with other classical radar denoising approaches in the literature demonstrated that the proposed GAN framework performs better and is more robust to different noise levels.

CV · Oct 15, 2018
Adversarial Inpainting of Medical Image Modalities

Karim Armanious, Youssef Mecky, Sergios Gatidis et al.

Numerous factors could lead to partial deteriorations of medical images. For example, metallic implants will lead to localized perturbations in MRI scans. This will affect further post-processing tasks such as attenuation correction in PET/MRI or radiation therapy planning. In this work, we propose the inpainting of medical images via Generative Adversarial Networks (GANs). The proposed framework incorporates two patch-based discriminator networks with additional style and perceptual losses for the inpainting of missing information in a realistically detailed and contextually consistent manner. The proposed framework outperformed other natural image inpainting techniques both qualitatively and quantitatively on two different medical modalities.

CV · Sep 17, 2018
Retrospective correction of Rigid and Non-Rigid MR motion artifacts using GANs

Karim Armanious, Sergios Gatidis, Konstantin Nikolaou et al.

Motion artifacts are a primary source of magnetic resonance (MR) image quality deterioration with strong repercussions on diagnostic performance. Currently, MR motion correction is carried out either prospectively, with the help of motion tracking systems, or retrospectively by mainly utilizing computationally expensive iterative algorithms. In this paper, we utilize a new adversarial framework, titled MedGAN, for the joint retrospective correction of rigid and non-rigid motion artifacts in different body regions and without the need for a reference image. MedGAN utilizes a unique combination of non-adversarial losses and a new generator architecture to capture the textures and fine-detailed structures of the desired artifact-free MR images. Quantitative and qualitative comparisons with other adversarial techniques illustrate the performance of the proposed model.

CV · Jun 17, 2018
MedGAN: Medical Image Translation using GANs

Karim Armanious, Chenming Jiang, Marc Fischer et al.

Image-to-image translation is considered a new frontier in the field of medical image analysis, with numerous potential applications. However, a large portion of recent approaches offer individualized solutions based on specialized task-specific architectures or require refinement through non-end-to-end training. In this paper, we propose a new framework, named MedGAN, for medical image-to-image translation which operates on the image level in an end-to-end manner. MedGAN builds upon recent advances in the field of generative adversarial networks (GANs) by merging the adversarial framework with a new combination of non-adversarial losses. We utilize a discriminator network as a trainable feature extractor which penalizes the discrepancy between the translated medical images and the desired modalities. Moreover, style-transfer losses are utilized to match the textures and fine-structures of the desired target images to the translated images. Additionally, we present a new generator architecture, titled CasNet, which enhances the sharpness of the translated medical outputs through progressive refinement via encoder-decoder pairs. Without any application-specific modifications, we apply MedGAN on three different tasks: PET-CT translation, correction of MR motion artefacts and PET image denoising. Perceptual analysis by radiologists and quantitative evaluations illustrate that MedGAN outperforms other existing translation approaches.
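Style-transfer losses of the kind mentioned in the abstract are usually built on Gram matrices of feature maps, which capture which feature channels co-activate regardless of where in the image they do so. The sketch below follows that generic formulation. The feature extractor, layer choice, and normalisation used in MedGAN are not given in the abstract, so `gram_matrix` and `style_loss` here are illustrative assumptions rather than the paper's definitions.

```python
def gram_matrix(features):
    """Channel-by-channel correlation of a feature map.

    `features` is a list of C channels, each a flat list of N activations
    (the spatial dimensions are already flattened).
    """
    n = len(features[0])
    return [
        [sum(x * y for x, y in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def style_loss(features_a, features_b):
    """Sum of squared differences between the two Gram matrices."""
    gram_a, gram_b = gram_matrix(features_a), gram_matrix(features_b)
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(gram_a, gram_b)
        for x, y in zip(row_a, row_b)
    )


target = [[1, 2], [3, 4]]    # 2 channels, 2 spatial positions
shuffled = [[2, 1], [4, 3]]  # same activations, positions swapped
different = [[1, 2], [4, 3]]

same_texture = style_loss(target, shuffled)
other_texture = style_loss(target, different)
```

Because the Gram matrix sums over spatial positions, swapping positions consistently across channels leaves the loss at zero. This is why such a loss constrains texture statistics rather than exact pixel layout.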