Mohammad Mahdi Ahmadi

h-index: 14

3 papers

1,172 citations

3 Papers

1.4 CV Aug 15, 2022

Elderly Fall Detection Using CCTV Cameras under Partial Occlusion of the Subjects Body

Sara Khalili, Hoda Mohammadzade, Mohammad Mahdi Ahmadi

One of the possible dangers that older people face in their daily lives is falling. Occlusion is one of the biggest challenges for vision-based fall detection systems and degrades their detection performance considerably. To tackle this problem, we synthesize specifically designed occluded videos for training fall detection systems using existing datasets. Then, by defining a new cost function, we introduce a framework for weighted training of fall detection models on occluded and un-occluded videos, which can be applied to any learnable fall detection system. Finally, we use both a non-deep and a deep model to evaluate the effect of the proposed weighted training method. Experiments show that the proposed method can improve the classification accuracy by 36% for a non-deep model and 55% for a deep model in occlusion conditions. Moreover, the proposed training framework also significantly improves the detection performance of a deep network on normal un-occluded samples.
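The abstract does not give the form of the new cost function, so the following is only a minimal sketch of one way a weighted loss over occluded and un-occluded samples could look. The convex combination, the binary cross-entropy term, and the names `weighted_occlusion_loss` and `occluded_weight` are all assumptions made for illustration, not the authors' formulation.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy of one predicted fall probability p against label y (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def weighted_occlusion_loss(samples, occluded_weight=0.5):
    """Hypothetical weighted cost: blend the mean loss of the two sample groups.

    samples: iterable of (predicted_probability, label, is_occluded).
    occluded_weight: assumed hyperparameter in [0, 1]; 0 ignores occluded
    videos, 1 trains on occluded videos only.
    """
    clean = [binary_cross_entropy(p, y) for p, y, occ in samples if not occ]
    occluded = [binary_cross_entropy(p, y) for p, y, occ in samples if occ]
    return (1.0 - occluded_weight) * _mean(clean) + occluded_weight * _mean(occluded)


# Toy batch: two un-occluded and two (synthetically) occluded clips.
batch = [(0.9, 1, False), (0.2, 0, False), (0.6, 1, True), (0.7, 0, True)]
loss = weighted_occlusion_loss(batch, occluded_weight=0.7)
```

Raising `occluded_weight` makes errors on occluded clips count more in the total, which is the general idea of weighted training; the actual weighting scheme in the paper may differ.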

CV Jun 19

Memory-Augmented LSTM Autoencoder for Unsupervised Activity Recognition with IMU Sensor Fusion

Saeid Arabzadeh, Farshad Almasganj, Mohammad Mahdi Ahmadi

Human Activity Recognition (HAR) using Inertial Measurement Unit (IMU) sensors is vital for healthcare monitoring and rehabilitation. Despite deep learning advancements, major challenges remain: reliance on labeled data, multi-sensor fusion complexity, and the limited ability of unsupervised methods to capture spatiotemporal dependencies. These issues are pronounced in real-world scenarios with noisy data, overlapping activities, and missing labels. We propose a fully unsupervised spatiotemporal feature fusion framework using a memory-augmented autoencoder. It enhances activity representations via short temporal windows of multi-sensor IMU data, enabling real-time applications. Our framework extracts hierarchical static features via a Stacked Autoencoder, fusing them within and across sensors. A sequence-to-sequence LSTM Autoencoder then temporally refines these features, incorporating historical motion patterns without labels. We analyze key hyperparameters to identify configurations that maximize feature separability under short-window constraints. Evaluated on DaLiAc and PAMAP2 using realistic inter-class window segmentation, our method achieves 96.6% and 98.4% accuracy, respectively, surpassing both supervised baselines and unsupervised approaches. Our method improves feature separability by up to 9% despite shorter temporal windows. While our realistic inter-class segmentation reduces accuracy by ~7%, it was intentionally adopted to better reflect real-world activity transitions and practical relevance.
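The abstract does not define "inter-class window segmentation" in detail. The sketch below assumes it means cutting one continuous recording into fixed-length windows without resetting at activity boundaries, so that some windows straddle a transition. The function name `sliding_windows`, the majority-vote labeling, and the `is_mixed` flag are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter


def sliding_windows(samples, labels, window, step):
    """Cut a continuous recording into fixed-length windows.

    Windows are NOT restarted at activity boundaries, so a window may
    contain samples from two activities (assumed reading of "inter-class").
    Returns a list of (window_samples, majority_label, is_mixed).
    """
    out = []
    for start in range(0, len(samples) - window + 1, step):
        chunk_labels = labels[start:start + window]
        majority, _ = Counter(chunk_labels).most_common(1)[0]
        is_mixed = len(set(chunk_labels)) > 1  # window spans a transition
        out.append((samples[start:start + window], majority, is_mixed))
    return out


# Toy stream: 5 "walk" samples followed by 5 "sit" samples.
stream = list(range(10))
stream_labels = ["walk"] * 5 + ["sit"] * 5
windows = sliding_windows(stream, stream_labels, window=4, step=2)
```

In this toy stream the two middle windows cross the walk-to-sit transition. Windows of that kind are the ones a per-activity segmentation would never produce, which is consistent with the accuracy drop the abstract reports for the more realistic setting.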

7.1 LG Feb 6, 2025

CNN Autoencoders for Hierarchical Feature Extraction and Fusion in Multi-sensor Human Activity Recognition

Saeed Arabzadeh, Farshad Almasganj, Mohammad Mahdi Ahmadi

Deep learning methods have been widely used for Human Activity Recognition (HAR) using recorded signals from Inertial Measurement Unit (IMU) sensors that are installed on various parts of the human body. For this type of HAR, several challenges exist, the most significant of which is the analysis of multivarious IMU sensor data. Here, we introduce a Hierarchically Unsupervised Fusion (HUF) model designed to extract and fuse features from IMU sensor data via a hybrid structure of Convolutional Neural Networks (CNNs) and Autoencoders (AEs). First, we design a stacked CNN-AE to embed short-time signals into sets of high-dimensional features. Second, we develop another CNN-AE network to locally fuse the extracted features from each sensor unit. Finally, we unify all the sensor features through a third CNN-AE architecture, acting as a global feature fusion stage, to create a unique feature set. Additionally, we analyze the effects of varying the model hyperparameters. The best results are achieved with eight convolutional layers in each AE. Furthermore, it is determined that an overcomplete AE with 256 kernels in the code layer is suitable for feature extraction in the first block of the proposed HUF model; this number reduces to 64 in the last block of the model to customize the size of the features applied to the classifier. The tuned model is applied to the UCI-HAR, DaLiAc, and Parkinson's disease gait datasets, achieving classification accuracies of 97%, 97%, and 88%, respectively, which are nearly 3% better than state-of-the-art supervised methods.