Fatemeh Vafaee

2.6 LG Sep 20, 2024

Multi-omics data integration for early diagnosis of hepatocellular carcinoma (HCC) using machine learning

Annette Spooner, Mohammad Karimi Moridani, Azadeh Safarchi et al.

The complementary information found in different modalities of patient data can aid in more accurate modelling of a patient's disease state and a better understanding of the underlying biological processes of a disease. However, the analysis of multi-modal, multi-omics data presents many challenges, including high dimensionality and differences in size, statistical distribution, scale and signal strength between modalities. In this work we compare the performance of a variety of ensemble machine learning algorithms that are capable of late integration of multi-class data from different modalities. The ensemble methods and their variations tested were i) a voting ensemble, with hard and soft vote, ii) a meta learner, iii) a multi-modal AdaBoost model using a hard vote, a soft vote and a meta learner to integrate the modalities on each boosting round, iv) the PB-MVBoost model and v) a novel application of a mixture of experts model. These were compared to simple concatenation as a baseline. We examine these methods using data from an in-house study on hepatocellular carcinoma (HCC), along with four validation datasets from studies on breast cancer and inflammatory bowel disease (IBD). Using the area under the receiver operating characteristic curve as a measure of performance, we develop models that achieve a performance value of up to 0.85 and find that two boosted methods, PB-MVBoost and AdaBoost with a soft vote, were the overall best performing models. We also examine the stability of the features selected and the size of the clinical signature determined. Finally, we provide recommendations for the integration of multi-modal, multi-class data.
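The difference between the hard and soft vote mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three modalities and the probability values are all hypothetical, and it assumes each modality has its own classifier that outputs per-class probabilities for a patient.

```python
from collections import Counter


def soft_vote(modality_probs):
    """Average the per-class probabilities across modalities, then pick the argmax."""
    n_classes = len(modality_probs[0])
    avg = [
        sum(p[c] for p in modality_probs) / len(modality_probs)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: avg[c])


def hard_vote(modality_probs):
    """Each modality votes for its own most probable class; the majority wins."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in modality_probs]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical class probabilities for one patient and three classes,
# one row per modality-specific classifier (values invented for illustration).
probs = [
    [0.40, 0.35, 0.25],  # modality A: weakly prefers class 0
    [0.45, 0.40, 0.15],  # modality B: weakly prefers class 0
    [0.05, 0.90, 0.05],  # modality C: strongly prefers class 1
]

print("hard vote:", hard_vote(probs))
print("soft vote:", soft_vote(probs))
```

In this toy case the two rules disagree: the hard vote follows the two weakly confident modalities, while the soft vote lets the one highly confident modality dominate the averaged probabilities.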

3.3 SOC-PH May 24, 2020

Forecasting the Spread of Covid-19 Under Control Scenarios Using LSTM and Dynamic Behavioral Models

Seid Miad Zandavi, Taha Hossein Rashidi, Fatemeh Vafaee

To accurately predict the regional spread of Covid-19 infection, this study proposes a novel hybrid model which combines a long short-term memory (LSTM) artificial recurrent neural network with dynamic behavioral models. Several factors and control strategies affect the virus spread, and the uncertainty arising from confounding variables underlying the spread of the Covid-19 infection is substantial. The proposed model considers the effect of multiple factors to enhance the accuracy in predicting the number of cases and deaths across the top ten most-affected countries and Australia. The results show that the proposed model closely replicates test data. It not only provides accurate predictions but also estimates the daily behavior of the system under uncertainty. The hybrid model outperforms the LSTM model given the limited available data. The parameters of the hybrid models were optimized using a genetic algorithm for each country to improve the prediction power while considering regional properties. Since the proposed model can accurately predict Covid-19 spread under consideration of containment policies, it can be used for policy assessment, planning and decision-making.
