CV | Apr 10, 2023
Deepfake Detection of Occluded Images Using a Patch-based Approach
Mahsa Soleimani, Ali Nazari, Mohsen Ebrahimi Moghaddam
Deepfake techniques use deep learning and artificial intelligence to produce or alter video and image content, typically with GANs. They can be misused to create fictitious news and commit ethical and financial crimes, and they also degrade the performance of facial recognition systems. Detecting whether an image is real or fake is therefore important, especially for authenticating people's images or videos. One of the most important challenges in this area is occlusion, which decreases system precision. In this study, we present a deep learning approach that uses the entire face and face patches to distinguish real from fake images in the presence of occlusion, with a three-path decision: first, entire-face reasoning; second, a decision based on the concatenation of the feature vectors of face patches; and third, a majority-vote decision based on these features. To test our approach, we create new datasets of real and fake images. To produce fake images, StyleGAN and StyleGAN2 are trained on FFHQ images, and StarGAN and PGGAN are trained on CelebA images. The CelebA and FFHQ datasets serve as the real images. The proposed approach reaches higher results in early epochs than other methods and improves on SoTA results by 0.4%-7.9% across the constructed datasets. Our experiments also show that weighting the patches may improve accuracy.
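The three decision paths described in this abstract can be illustrated with a minimal sketch. The function names, the 0.5 threshold, and the idea that each path outputs a "fake" probability are assumptions made for illustration; the abstract does not specify how the paths are scored or how their outputs are finally combined, so the sketch only reports the three decisions side by side.

```python
from typing import Dict, List


def majority_vote(patch_probs: List[float], threshold: float = 0.5) -> bool:
    """Return True (fake) when strictly more than half of the patches vote fake."""
    votes = [p >= threshold for p in patch_probs]
    return sum(votes) > len(votes) / 2


def three_path_decision(
    face_prob: float,
    concat_prob: float,
    patch_probs: List[float],
    threshold: float = 0.5,
) -> Dict[str, bool]:
    """Report the decision of each path (hypothetical interface).

    face_prob   -- fake probability from the entire-face path
    concat_prob -- fake probability from the concatenated patch features
    patch_probs -- per-patch fake probabilities used for the majority vote
    """
    return {
        "full_face": face_prob >= threshold,
        "concat_patches": concat_prob >= threshold,
        "patch_vote": majority_vote(patch_probs, threshold),
    }


if __name__ == "__main__":
    # An occluded patch (0.2) is outvoted by the two visible patches.
    print(three_path_decision(0.8, 0.3, [0.9, 0.7, 0.2]))
```

The patch vote is what gives the approach some tolerance to occlusion in this sketch: a single occluded patch cannot flip the decision as long as most visible patches agree.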
IV | Mar 18, 2023
Smart ROI Detection for Alzheimer's disease prediction using explainable AI
Atefe Aghaei, Mohsen Ebrahimi Moghaddam
Purpose: Predicting the progression of MCI to Alzheimer's disease is an important step in slowing the progression of the disease, and many deep learning methods have been introduced for this task. Among these approaches, ROI-based methods offer a good balance of accuracy and complexity. In these techniques, specific parts of the brain are extracted manually as ROIs for all patients. Manual ROI extraction is time-consuming, and its results depend on human expertise and precision. Method: To overcome these limitations, we propose a novel smart method that detects ROIs automatically, based on explainable AI using Grad-CAM and a 3D CNN model that extracts ROIs per patient. After the ROIs are extracted automatically, Alzheimer's disease is predicted using an ROI-based 3D CNN. Results: We apply our method to 176 MCI patients from the well-known ADNI dataset and obtain remarkable results compared to state-of-the-art methods. The accuracy obtained using 5-fold cross-validation is 98.6 and the AUC is 1. We also compare the results of the ROI-based method with the whole-brain method; the results show a substantial increase in performance. Conclusion: The experimental results show that the proposed smart ROI extraction, which extracts the ROIs automatically, performs well for Alzheimer's disease prediction. The proposed method can also be used for Alzheimer's disease classification and diagnosis.
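One plausible way to turn a per-patient saliency volume into an ROI is to threshold it and take the bounding box of the surviving voxels. The sketch below assumes exactly that; the abstract does not state how ROIs are derived from the Grad-CAM output, so `roi_bounding_box`, the nested-list volume layout, and the threshold are all illustrative assumptions rather than the authors' procedure.

```python
from typing import List, Optional, Tuple

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Box = Tuple[Tuple[int, int, int], Tuple[int, int, int]]


def roi_bounding_box(heatmap: Volume, threshold: float) -> Optional[Box]:
    """Return ((z0, y0, x0), (z1, y1, x1)), the inclusive bounding box of all
    voxels whose saliency is at least `threshold`, or None if there are none."""
    coords = [
        (z, y, x)
        for z, plane in enumerate(heatmap)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value >= threshold
    ]
    if not coords:
        return None
    mins = tuple(min(c[i] for c in coords) for i in range(3))
    maxs = tuple(max(c[i] for c in coords) for i in range(3))
    return mins, maxs


if __name__ == "__main__":
    # Toy 2 x 2 x 3 saliency volume standing in for a Grad-CAM map.
    toy = [
        [[0.1, 0.2, 0.1], [0.1, 0.9, 0.8]],
        [[0.0, 0.1, 0.0], [0.0, 0.7, 0.1]],
    ]
    print(roi_bounding_box(toy, threshold=0.5))
```

Because the box is computed from each patient's own saliency map, the resulting ROI can differ from patient to patient, which is the property the abstract emphasizes.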
CV | Feb 8, 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz, Shahabedin Nabavi, Mohsen Ebrahimi Moghaddam et al.
Recently, the attention-enriched encoder-decoder framework has attracted great interest in image captioning because of the progress it has enabled. Many visual attention models directly leverage meaningful regions to generate image descriptions. However, a direct transition from visual space to text is not enough to generate fine-grained captions. This paper exploits a feature-compounding approach to bring together high-level semantic concepts and visual information about the contextual environment in a fully end-to-end manner. We propose a stacked cross-modal feature consolidation (SCFC) attention network for image captioning, in which we simultaneously consolidate cross-modal features through a novel compounding function in a multi-step reasoning fashion. In addition, we jointly employ spatial information and context-aware attributes (CAA) as the principal components of the proposed compounding function, where the CAA provide a concise, context-sensitive semantic representation. To make better use of the potential of the consolidated features, we further propose an SCFC-LSTM as the caption generator, which can leverage discriminative semantic information throughout the caption generation process. The experimental results indicate that the proposed SCFC can outperform various state-of-the-art image captioning methods in terms of popular metrics on the MSCOCO and Flickr30K datasets.
CV | Sep 18, 2025 | Code
A Race Bias Free Face Aging Model for Reliable Kinship Verification
Ali Nazari, Bardiya Kariminia, Mohsen Ebrahimi Moghaddam
The age gap in kinship verification refers to the time difference between the photos of the parent and the child. Moreover, same-age photos of the two are often unavailable, and face aging models are racially biased, which affects the likeness of the photos. Therefore, we propose a face aging GAN model, RA-GAN, consisting of two new modules, RACEpSp and a feature mixer, to produce racially unbiased images. The unbiased synthesized photos are used in kinship verification to investigate the results of verifying same-age parent-child images. The experiments demonstrate that, in terms of racial accuracy, our RA-GAN outperforms SAM-GAN by an average of 13.14% across all age groups and CUSP-GAN by 9.1% in the 60+ age group. Moreover, RA-GAN preserves subjects' identities better than SAM-GAN and CUSP-GAN across all age groups. Additionally, we demonstrate that transforming parent and child images from the KinFaceW-I and KinFaceW-II datasets to the same age can enhance verification accuracy across all age groups. On KinFaceW-I, the accuracy increases obtained with RA-GAN for the father-son, father-daughter, mother-son, and mother-daughter relationships are 5.22, 5.12, 1.63, and 0.41, respectively. On KinFaceW-II, the increases for the father-daughter, father-son, and mother-son relationships are 2.9, 0.39, and 1.6, respectively. The code is available on GitHub: https://github.com/bardiya2254kariminia/An-Age-Transformation-whitout-racial-bias-for-Kinship-verification
CV | Apr 26, 2025 | Code
Kinship Verification through a Forest Neural Network
Ali Nazari, Mohsen Ebrahimi Moghaddam, Omidreza Borzoei
Early methods used face representations in kinship verification, which are less accurate than joint representations of parents' and children's facial images learned from scratch. We propose an approach featuring graph neural network concepts to utilize face representations and have comparable results to joint representation algorithms. Moreover, we designed the structure of the classification module and introduced a new combination of losses to engage the center loss gradually in training our network. Additionally, we conducted experiments on KinFaceW-I and II, demonstrating the effectiveness of our approach. We achieved the best result on KinFaceW-II, an average improvement of nearly 1.6 for all kinship types, and we were near the best on KinFaceW-I. The code is available at https://github.com/ali-nazari/Kinship-Verification
CV | Dec 4, 2025
Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model
Bita Baroutian, Atefe Aghaei, Mohsen Ebrahimi Moghaddam
Alcohol consumption is a significant public health concern and a major cause of accidents and fatalities worldwide. This study introduces a novel video-based facial sequence analysis approach dedicated to the detection of alcohol intoxication. The method integrates facial landmark analysis via a Graph Attention Network (GAT) with spatiotemporal visual features extracted using a 3D ResNet. These features are dynamically fused with adaptive prioritization to enhance classification performance. Additionally, we introduce a curated dataset comprising 3,542 video segments derived from 202 individuals to support training and evaluation. Our model is compared against two baselines: a custom 3D-CNN and a VGGFace+LSTM architecture. Experimental results show that our approach achieves 95.82% accuracy, 0.977 precision, and 0.97 recall, outperforming prior methods. The findings demonstrate the model's potential for practical deployment in public safety systems for non-invasive, reliable alcohol intoxication detection.
CV | Mar 26, 2025
Small Object Detection: A Comprehensive Survey on Challenges, Techniques and Real-World Applications
Mahya Nikouei, Bita Baroutian, Shahabedin Nabavi et al.
Small object detection (SOD) is a critical yet challenging task in computer vision, with applications spanning surveillance, autonomous systems, medical imaging, and remote sensing. Unlike larger objects, small objects contain limited spatial and contextual information, making accurate detection difficult. Challenges such as low resolution, occlusion, background interference, and class imbalance further complicate the problem. This survey provides a comprehensive review of recent advancements in SOD using deep learning, focusing on articles published in Q1 journals during 2024-2025. We analyze challenges, state-of-the-art techniques, datasets, evaluation metrics, and real-world applications. Recent advancements in deep learning have introduced innovative solutions, including multi-scale feature extraction, Super-Resolution (SR) techniques, attention mechanisms, and transformer-based architectures. Additionally, improvements in data augmentation, synthetic data generation, and transfer learning have addressed data scarcity and domain adaptation issues. Furthermore, emerging trends such as lightweight neural networks, knowledge distillation (KD), and self-supervised learning offer promising directions for improving detection efficiency, particularly in resource-constrained environments like Unmanned Aerial Vehicle (UAV)-based surveillance and edge computing. We also review widely used datasets, along with standard evaluation metrics such as mean Average Precision (mAP) and size-specific AP scores. The survey highlights real-world applications, including traffic monitoring, maritime surveillance, industrial defect detection, and precision agriculture. Finally, we discuss open research challenges and future directions, emphasizing the need for robust domain adaptation techniques, better feature fusion strategies, and real-time performance optimization.
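The metrics mentioned here rest on two building blocks: intersection over union (IoU) between a predicted and a ground-truth box, and a size bucket that decides which size-specific AP a ground-truth object counts toward. The sketch below uses the COCO convention of 32² and 96² pixel-area thresholds; as a simplifying assumption it buckets by bounding-box area, whereas COCO itself uses the annotated segmentation area. The function names are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def size_bucket(box: Box) -> str:
    """COCO-style size bucket, computed here from box area (an approximation)."""
    a = area(box)
    if a < 32 ** 2:
        return "small"
    if a < 96 ** 2:
        return "medium"
    return "large"


if __name__ == "__main__":
    gt = (0.0, 0.0, 10.0, 10.0)
    pred = (2.0, 0.0, 12.0, 10.0)  # same size, shifted 2 px to the right
    print(size_bucket(gt), round(iou(gt, pred), 3))
```

The usage example also hints at why small objects are hard to evaluate: a shift of only two pixels on a 10-pixel box already lowers the IoU substantially, while the same shift on a large box would barely register.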
9.9 | CV | Apr 28
Generalized Disguise Makeup Presentation Attack Detection Using an Attention-Guided Patch-Based Framework
Fateme Taraghi, Atefe Aghaei, Mohsen Ebrahimi Moghaddam
Despite significant advances in facial recognition systems, they remain vulnerable to face presentation attacks. Among them, disguise makeup attacks are particularly challenging, as they use advanced cosmetics, prosthetic components, and artificial materials to realistically alter facial appearance, often making detection difficult even for humans. Despite their importance, this problem remains underexplored, and publicly available datasets are limited. To address this, we propose a generalized disguise makeup presentation attack detection framework. The method adopts a two-phase design in which a style-invariant full-face model, trained with metric learning and enhanced by a whitening transformation, extracts region attention scores via Grad-CAM. These scores guide a patch-based phase that performs localized analysis using region-specific subnetworks trained with metric learning for fine-grained discrimination. We also construct a new, diverse dataset of live and disguise makeup faces collected under real-world conditions, covering variations in subjects, environments, and disguise materials. Experimental results demonstrate strong generalization across both the collected dataset and SIW-Mv2, achieving 8.97% ACER and 9.76% EER on the collected dataset, and 0% ACER on Obfuscation and Impersonation and 1.34% on Cosmetics attacks of SIW-Mv2. The proposed method consistently outperforms prior works while maintaining robust performance across other spoof types.
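For readers unfamiliar with the reported metric, ACER is the mean of two error rates: APCER (attack presentations accepted as bona fide) and BPCER (bona fide presentations rejected as attacks). The sketch below computes them for a single attack type at a fixed decision; the label encoding (1 = attack, 0 = bona fide) and the function name are assumptions for illustration, and protocols with several attack types typically compute APCER per type before aggregating.

```python
from typing import Dict, Sequence


def pad_error_rates(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Compute APCER, BPCER, and ACER for binary decisions.

    labels, preds: 1 = attack, 0 = bona fide (encoding assumed for this sketch).
    """
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")
    attacks = [p for y, p in zip(labels, preds) if y == 1]
    bona_fide = [p for y, p in zip(labels, preds) if y == 0]
    if not attacks or not bona_fide:
        raise ValueError("need at least one attack and one bona fide sample")
    apcer = sum(1 for p in attacks if p == 0) / len(attacks)      # missed attacks
    bpcer = sum(1 for p in bona_fide if p == 1) / len(bona_fide)  # false alarms
    return {"APCER": apcer, "BPCER": bpcer, "ACER": (apcer + bpcer) / 2}


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
    print(pad_error_rates(y_true, y_pred))
```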
IV | Aug 28, 2025
GENRE-CMR: Generalizable Deep Learning for Diverse Multi-Domain Cardiac MRI Reconstruction
Kian Anvari Hamedani, Narges Razizadeh, Shahabedin Nabavi et al.
Accelerated Cardiovascular Magnetic Resonance (CMR) image reconstruction remains a critical challenge due to the trade-off between scan time and image quality, particularly when generalizing across diverse acquisition settings. We propose GENRE-CMR, a generative adversarial network (GAN)-based architecture employing a residual deep unrolled reconstruction framework to enhance reconstruction fidelity and generalization. The architecture unrolls iterative optimization into a cascade of convolutional subnetworks, enriched with residual connections to enable progressive feature propagation from shallow to deeper stages. To further improve performance, we integrate two loss functions: (1) an Edge-Aware Region (EAR) loss, which guides the network to focus on structurally informative regions and helps prevent common reconstruction blurriness; and (2) a Statistical Distribution Alignment (SDA) loss, which regularizes the feature space across diverse data distributions via a symmetric KL divergence formulation. Extensive experiments confirm that GENRE-CMR surpasses state-of-the-art methods on training and unseen data, achieving 0.9552 SSIM and 38.90 dB PSNR on unseen distributions across various acceleration factors and sampling trajectories. Ablation studies confirm the contribution of each proposed component to reconstruction quality and generalization. Our framework presents a unified and robust solution for high-quality CMR reconstruction, paving the way for clinically adaptable deployment across heterogeneous acquisition protocols.
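The symmetric KL divergence underlying the SDA loss can be illustrated on two discrete distributions. This sketch uses the plain sum KL(p‖q) + KL(q‖p); whether the paper sums or averages the two directions, and over which feature statistics it is computed, is not stated in the abstract, so those choices and the small `eps` guard are assumptions.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for discrete distributions; eps guards against log(0)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p: Sequence[float], q: Sequence[float]) -> float:
    """Symmetric KL as the sum of both directions (one common convention)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


if __name__ == "__main__":
    domain_a = [0.5, 0.5]
    domain_b = [0.9, 0.1]
    print(symmetric_kl(domain_a, domain_b))
```

Unlike one-directional KL, this quantity does not depend on which distribution is treated as the reference, which is a natural property for a term meant to align statistics across data domains.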
IV | Feb 1, 2022
A generalizable approach based on U-Net model for automatic Intra retinal cyst segmentation in SD-OCT images
Razieh Ganjee, Mohsen Ebrahimi Moghaddam, Ramin Nourinia
Intraretinal fluids, or cysts, are among the important symptoms of macular pathologies and are efficiently visualized in OCT images. Automatic segmentation of these abnormalities has been widely investigated in medical image processing studies. In this paper, we propose a new U-Net-based approach for intraretinal cyst (IRC) segmentation across different vendors that addresses some of the challenges faced by previous deep learning-based techniques. The proposed method has two main steps: (1) prior information embedding and input data adjustment, and (2) the IRC segmentation model. In the first step, we inject the information into the network in a way that overcomes some of the network's limitations in receiving data and learning important contextual knowledge. In the second step, we introduce a connection module between the encoder and decoder parts of the standard U-Net architecture that transfers information more effectively from the encoder to the decoder. Two public datasets, OPTIMA and KERMANY, were used to evaluate the proposed method. The results show that the proposed method is an efficient vendor-independent approach for IRC segmentation, with mean Dice values of 0.78 and 0.81 on the OPTIMA and KERMANY datasets, respectively.
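The Dice values reported here measure the overlap between a predicted mask and a reference mask: twice the intersection divided by the sum of the two mask sizes. The sketch below computes it on flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(predicted, reference))
```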
IV | Oct 1, 2020
Medical Imaging and Computational Image Analysis in COVID-19 Diagnosis: A Review
Shahabedin Nabavi, Azar Ejmalian, Mohsen Ebrahimi Moghaddam et al.
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. The disease presents with symptoms such as shortness of breath, fever, dry cough, and chronic fatigue, amongst others. Sometimes the symptoms become so severe that they lead to the death of the patient. The disease may be asymptomatic in some patients in the early stages, which can lead to increased transmission to others. Many studies have tried to use medical imaging for early diagnosis of COVID-19. This study reviews papers on automatic methods for medical image analysis and diagnosis of COVID-19. For this purpose, PubMed, Google Scholar, arXiv and medRxiv were searched for related studies published up to the end of April 2020, and the essential points of the collected studies were summarised. The contribution of this study is four-fold: 1) to serve as a tutorial of the field for both clinicians and technologists, 2) to comprehensively review the characteristics of COVID-19 as presented in medical images, 3) to examine automated artificial intelligence-based approaches for COVID-19 diagnosis based on the accuracy and the method used, 4) to describe the research limitations in this field and the methods used to overcome them. COVID-19 produces signs in medical images that can be used for early diagnosis of the disease, even in asymptomatic patients. Automated machine learning-based methods can diagnose the disease from medical images with high accuracy and reduce the time, cost and error of the diagnostic procedure. It is recommended to collect bulk imaging data from patients in the shortest possible time to improve the performance of automated COVID-19 diagnostic methods.
CV | Mar 17, 2019
A Weighted Multi-Criteria Decision Making Approach for Image Captioning
Hassan Maleki Galandouz, Mohsen Ebrahimi Moghaddam, Mehrnoush Shamsfard
Image captioning aims to automatically generate natural-language descriptions of an image. This is a challenging problem in artificial intelligence that has recently received significant attention in the computer vision and natural language processing communities. Among the existing approaches, visual retrieval-based methods have proven to be highly effective. These approaches search for similar images, then build a caption for the query image based on the captions of the retrieved images. In this study, we present a method for visual retrieval-based image captioning in which we use a multi-criteria decision making algorithm to effectively combine several criteria with proportional impact weights to retrieve the most relevant caption for the query image. The main idea of the proposed approach is to design a mechanism that retrieves captions more semantically relevant to the query image and then selects the most appropriate caption, imitating how a human would choose, using a weighted multi-criteria decision making algorithm. Experiments conducted on the MS COCO benchmark dataset show that, by using criteria with proportional impact weights, the proposed method provides much more effective results compared to state-of-the-art models.
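The idea of selecting a caption by combining several weighted criteria can be sketched with the simplest multi-criteria rule, a weighted sum. This is a stand-in: the abstract does not say which multi-criteria decision making algorithm, which criteria, or which weights the authors use, so the criterion names, scores, and weights below are all hypothetical.

```python
from typing import Dict


def weighted_score(criteria: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted-sum score of one candidate; every weighted criterion must be present."""
    return sum(weights[name] * criteria[name] for name in weights)


def select_caption(
    candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> str:
    """Return the candidate caption with the highest weighted-sum score."""
    return max(candidates, key=lambda cap: weighted_score(candidates[cap], weights))


if __name__ == "__main__":
    # Hypothetical criteria scores in [0, 1] for two retrieved captions.
    candidates = {
        "a dog runs on the beach": {"visual_similarity": 0.9, "semantic_relevance": 0.2},
        "a dog plays in the sand": {"visual_similarity": 0.5, "semantic_relevance": 0.9},
    }
    weights = {"visual_similarity": 0.6, "semantic_relevance": 0.4}
    print(select_caption(candidates, weights))
```

In this toy example the second caption wins (0.66 versus 0.62) even though its source image is less visually similar, which illustrates how weighting lets a strong semantic match outweigh a purely visual one.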
CV | Feb 24, 2014
A Novel Scheme for Intelligent Recognition of Pornographic Images
Seyed Mostafa Kia, Hossein Rahmani, Reza Mortezaei et al.
Harmful content on the internet is increasing day by day, which motivates further research into fast and reliable filtering of obscene and immoral material. Pornographic image recognition is an important component of any filtering system. In this paper, a new approach for detecting pornographic images is introduced. In this approach, two new features are suggested. These two features, in combination with other simple traditional features, discriminate reasonably well between pornographic and non-pornographic images. In addition, we apply fuzzy integral-based information fusion to combine MLP (Multi-Layer Perceptron) and NF (Neuro-Fuzzy) outputs. To test the proposed method, the performance of the system was evaluated on 18,354 images downloaded from the internet. The attained rates were 93% TP and 8% FP on the training dataset, and 87% and 5.5% on the test dataset. The results verify the performance of the proposed system against other related works.