IVFeb 5
MRI Cross-Modal Synthesis: A Comparative Study of Generative Models for T1-to-T2 ReconstructionAli Alqutayfi, Sadam Al-Azani
MRI cross-modal synthesis involves generating images from one acquisition protocol using another, offering considerable clinical value by reducing scan time while maintaining diagnostic information. This paper presents a comprehensive comparison of three state-of-the-art generative models for T1-to-T2 MRI reconstruction: Pix2Pix GAN, CycleGAN, and Variational Autoencoder (VAE). Using the BraTS 2020 dataset (11,439 training and 2,000 testing slices), we evaluate these models based on established metrics including Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM). Our experiments demonstrate that all models can successfully synthesize T2 images from T1 inputs, with CycleGAN achieving the highest PSNR (32.28 dB) and SSIM (0.9008), while Pix2Pix GAN provides the lowest MSE (0.005846). The VAE, though showing lower quantitative performance (MSE: 0.006949, PSNR: 24.95 dB, SSIM: 0.6573), offers advantages in latent space representation and sampling capabilities. This comparative study provides valuable insights for researchers and clinicians selecting appropriate generative models for MRI synthesis applications based on their specific requirements and data constraints.
GEO-PHDec 1, 2024
Well log data generation and imputation using sequence-based generative adversarial networksAbdulrahman Al-Fakih, A. Koeshidayatullah, Tapan Mukerji et al.
Well log analysis is crucial for hydrocarbon exploration, providing detailed insights into subsurface geological formations. However, gaps and inaccuracies in well log data, often due to equipment limitations, operational challenges, and harsh subsurface conditions, can introduce significant uncertainties in reservoir evaluation. Addressing these challenges requires effective methods for both synthetic data generation and precise imputation of missing data, ensuring data completeness and reliability. This study introduces a novel framework utilizing sequence-based generative adversarial networks (GANs) specifically designed for well log data generation and imputation. The framework integrates two distinct sequence-based GAN models: Time Series GAN (TSGAN) for generating synthetic well log data and Sequence GAN (SeqGAN) for imputing missing data. Both models were tested on a dataset from the North Sea, Netherlands region, focusing on different sections of 5, 10, and 50 data points. Experimental results demonstrate that this approach achieves superior accuracy in filling data gaps compared to other deep learning models for spatial series analysis. The method yielded R^2 values of 0.921, 0.899, and 0.594, with corresponding mean absolute percentage error (MAPE) values of 8.320, 0.005, and 151.154, and mean absolute error (MAE) values of 0.012, 0.005, and 0.032, respectively. These results set a new benchmark for data integrity and utility in geosciences, particularly in well log data analysis.
CVJun 4, 2025
Isharah: A Large-Scale Multi-Scene Dataset for Continuous Sign Language RecognitionSarah Alyami, Hamzah Luqman, Sadam Al-Azani et al.
Current benchmarks for sign language recognition (SLR) focus mainly on isolated SLR, while there are limited datasets for continuous SLR (CSLR), which recognizes sequences of signs in a video. Additionally, existing CSLR datasets are collected in controlled settings, which restricts their effectiveness in building robust real-world CSLR systems. To address these limitations, we present Isharah, a large multi-scene dataset for CSLR. It is the first dataset of its type and size that has been collected in an unconstrained environment using signers' smartphone cameras. This setup resulted in high variations of recording settings, camera distances, angles, and resolutions. This variation helps with developing sign language understanding models capable of handling the variability and complexity of real-world scenarios. The dataset consists of 30,000 video clips performed by 18 deaf and professional signers. Additionally, the dataset is linguistically rich as it provides a gloss-level annotation for all dataset's videos, making it useful for developing CSLR and sign language translation (SLT) systems. This paper also introduces multiple sign language understanding benchmarks, including signer-independent and unseen-sentence CSLR, along with gloss-based and gloss-free SLT. The Isharah dataset is available on https://snalyami.github.io/Isharah_CSLR/.