Vandad Davoodnia

h-index: 9

7 papers

233 citations

Novelty: 41%

AI Score: 27

Ranked #154,490 of 194,257 authors (top 80%)
Ranked #50,246 in CV (top 85%)

7 Papers

5.7 · CV · Jun 13, 2022

Estimating Pose from Pressure Data for Smart Beds with Deep Image-based Pose Estimators

Vandad Davoodnia, Saeed Ghorbani, Ali Etemad

In-bed pose estimation has shown value in fields such as hospital patient monitoring, sleep studies, and smart homes. In this paper, we explore different strategies for detecting body pose from highly ambiguous pressure data, with the aid of pre-existing pose estimators. We examine the performance of pre-trained pose estimators either by using them directly or by re-training them on two pressure datasets. We also explore other strategies utilizing a learnable pre-processing domain adaptation step, which transforms the vague pressure maps to a representation closer to the expected input space of general-purpose pose estimation modules. Accordingly, we used a fully convolutional network with multiple scales to provide the pose-specific characteristics of the pressure maps to the pre-trained pose estimation module. Our complete analysis of different approaches shows that combining a learnable pre-processing module with re-training pre-existing image-based pose estimators on the pressure data overcomes issues such as highly vague pressure points and achieves very high pose estimation accuracy.

5.0 · CV · Mar 10, 2023

Human Pose Estimation from Ambiguous Pressure Recordings with Spatio-temporal Masked Transformers

Vandad Davoodnia, Ali Etemad

Despite the impressive performance of vision-based pose estimators, they generally fail to perform well under adverse vision conditions and often don't satisfy the privacy demands of customers. As a result, researchers have begun to study tactile sensing systems as an alternative. However, these systems suffer from noisy and ambiguous recordings. To tackle this problem, we propose a novel solution for pose estimation from ambiguous pressure data. Our method comprises a spatio-temporal vision transformer with an encoder-decoder architecture. Detailed experiments on two popular public datasets reveal that our model outperforms existing solutions in the area. Moreover, we observe that increasing the number of temporal crops in the early stages of the network positively impacts performance, while pre-training the network in a self-supervised setting using a masked auto-encoder approach further improves the results.
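As a rough illustration of the masked auto-encoder pre-training mentioned in this abstract, the sketch below shows only the generic random-masking step: a fraction of input tokens is hidden and the remainder is passed to the encoder. The function name `random_mask`, the 75% mask ratio, and the token count are assumptions for illustration and are not taken from the paper.

```python
import random

def random_mask(num_tokens, mask_ratio, seed=0):
    """Split token indices into visible and masked sets (hypothetical helper).

    In masked auto-encoder pre-training, the encoder typically sees only the
    visible tokens and the decoder is trained to reconstruct the masked ones.
    """
    rng = random.Random(seed)
    num_masked = int(num_tokens * mask_ratio)
    masked = set(rng.sample(range(num_tokens), num_masked))
    visible = [i for i in range(num_tokens) if i not in masked]
    return visible, sorted(masked)

# Example: 16 spatio-temporal tokens, 75% of them hidden (assumed ratio).
visible, masked = random_mask(num_tokens=16, mask_ratio=0.75)
print(len(visible), len(masked))
```

The actual masking strategy, token layout, and ratio used by the authors may differ from this sketch.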

5.2 · CV · Apr 19, 2024

SkelFormer: Markerless 3D Pose and Shape Estimation using Skeletal Transformers

Vandad Davoodnia, Saeed Ghorbani, Alexandre Messier et al.

We introduce SkelFormer, a novel markerless motion capture pipeline for multi-view human pose and shape estimation. Our method first uses off-the-shelf 2D keypoint estimators, pre-trained on large-scale in-the-wild data, to obtain 3D joint positions. Next, we design a regression-based inverse-kinematic skeletal transformer that maps the joint positions to pose and shape representations from heavily noisy observations. This module integrates prior knowledge about pose space and infers the full pose state at runtime. Separating the 3D keypoint detection and inverse-kinematic problems, along with the expressive representations learned by our skeletal transformer, enhances the generalization of our method to unseen noisy data. We evaluate our method on three public datasets in both in-distribution and out-of-distribution settings, and observe strong performance with respect to prior works. Moreover, ablation experiments demonstrate the impact of each of the modules of our architecture. Finally, we study the performance of our method in dealing with noise and heavy occlusions and find considerable robustness with respect to other solutions.

7.6 · CV · Apr 23, 2024

UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues

Vandad Davoodnia, Saeed Ghorbani, Marc-André Carbonneau et al.

We introduce UPose3D, a novel approach for multi-view 3D human pose estimation, addressing challenges in accuracy and scalability. Our method advances existing pose estimation frameworks by improving robustness and flexibility without requiring direct 3D annotations. At the core of our method, a pose compiler module refines predictions from a 2D keypoint estimator that operates on a single image by leveraging temporal and cross-view information. Our novel cross-view fusion strategy is scalable to any number of cameras, while our synthetic data generation strategy ensures generalization across diverse actors, scenes, and viewpoints. Finally, UPose3D leverages the prediction uncertainty of both the 2D keypoint estimator and the pose compiler module. This provides robustness to outliers and noisy data, resulting in state-of-the-art performance in out-of-distribution settings. In addition, for in-distribution settings, UPose3D yields performance rivalling methods that rely on 3D annotated data while being the state-of-the-art among methods relying only on 2D supervision.
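To give a feel for why prediction uncertainty can provide robustness to outliers, here is a minimal sketch of inverse-variance weighting, a generic textbook technique. It is not the fusion method of UPose3D, whose details are not given in this abstract; the function name `fuse_estimates` and the sample numbers are hypothetical.

```python
def fuse_estimates(values, variances):
    """Inverse-variance weighted mean of scalar estimates (hypothetical helper).

    Each estimate is weighted by 1 / variance, so a view that reports high
    uncertainty contributes little to the fused result.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total

# Three views estimate one joint coordinate; the third is an uncertain outlier.
values = [1.0, 1.2, 5.0]
variances = [0.01, 0.01, 1.0]
print(fuse_estimates(values, variances))
```

With these assumed inputs the fused value stays close to the two confident views, whereas an unweighted mean would be pulled toward the outlier.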

7.5 · LG · Apr 5, 2021

Identity and Posture Recognition in Smart Beds with Deep Multitask Learning

Vandad Davoodnia, Ali Etemad

Sleep posture analysis is widely used for clinical patient monitoring and sleep studies. Earlier research has revealed that sleep posture highly influences symptoms of diseases such as apnea and pressure ulcers. In this study, we propose a robust deep learning model capable of accurately detecting subjects and their sleeping postures using the publicly available data acquired from a commercial pressure mapping system. A combination of loss functions is used to discriminate subjects and their sleeping postures simultaneously. The experimental results show that our proposed method can identify the patients and their in-bed posture with almost no errors in a 10-fold cross-validation scheme. Furthermore, we show that our network achieves an average accuracy of up to 99% when faced with new subjects in a leave-one-subject-out validation procedure on the three most common sleeping posture categories. We demonstrate the effects of the combined cost function over its parameter and show that learning both tasks simultaneously improves performance significantly. Finally, we evaluate our proposed pipeline by testing it over augmented images of our dataset. The proposed algorithm can ultimately be used in clinical and smart home environments as a complementary tool with other available automated patient monitoring systems.
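The leave-one-subject-out procedure named in this abstract can be sketched as follows: every sample from one subject is held out for testing while the model is trained on all other subjects, and this is repeated once per subject. The helper name `leave_one_subject_out` and the toy subject labels are assumptions for illustration only.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Toy example: six samples recorded from three subjects (made-up labels).
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
for held_out, train, test in leave_one_subject_out(subject_ids):
    print(held_out, train, test)
```

Because no sample from the held-out subject appears in training, this scheme measures generalization to new subjects, unlike a plain 10-fold split, in which samples from the same subject can fall into both the training and test folds.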

7.6 · CV · Aug 21, 2019

In-bed Pressure-based Pose Estimation using Image Space Representation Learning

Vandad Davoodnia, Saeed Ghorbani, Ali Etemad

Recent advances in deep pose estimation models have proven to be effective in a wide range of applications such as health monitoring, sports, animations, and robotics. However, pose estimation models fail to generalize when facing images acquired from in-bed pressure sensing systems. In this paper, we address this challenge by presenting a novel end-to-end framework capable of accurately locating body parts from vague pressure data. Our method exploits the idea of equipping an off-the-shelf pose estimator with a deep trainable neural network, which pre-processes and prepares the pressure data for subsequent pose estimation. Our model transforms the ambiguous pressure maps to images containing shapes and structures similar to the common input domain of the pre-existing pose estimation methods. As a result, we show that our model is able to reconstruct unclear body parts, which in turn enables pose estimators to accurately and robustly estimate the pose. We train and test our method on a manually annotated public pressure map dataset using a combination of loss functions. Results confirm the effectiveness of our method by the high visual quality in the generated images and the high pose estimation rates achieved.

12.5 · LG · Aug 6, 2019

Classification of Hand Movements from EEG using a Deep Attention-based LSTM Network

Guangyi Zhang, Vandad Davoodnia, Alireza Sepas-Moghaddam et al.

Classifying limb movements using brain activity is an important task in Brain-computer Interfaces (BCI) that has been successfully used in multiple application domains, ranging from human-computer interaction to medical and biomedical applications. This paper proposes a novel solution for classification of left/right hand movement by exploiting a Long Short-Term Memory (LSTM) network with an attention mechanism to learn the electroencephalogram (EEG) time-series information. To this end, a wide range of time and frequency domain features are extracted from the EEG signals and used to train an LSTM network to perform the classification task. We conduct extensive experiments with the EEG Movement dataset and show that our proposed solution achieves improvements over several benchmarks and state-of-the-art methods in both intra-subject and cross-subject validation schemes. Moreover, we utilize the proposed framework to analyze the information as received by the sensors and monitor the activated regions of the brain by tracking EEG topography throughout the experiments.