Sep 11, 2020

Machine Learning and Data Science approach towards trend and predictors analysis of CDC Mortality Data for the USA

arXiv:2009.05400v1
Originality: Synthesis-oriented
AI Analysis

This is an incremental analysis of mortality data for public health researchers, with limited practical impact.

The study analyzed CDC mortality data to identify trends and predictors, concluding that life expectancy and marital status affect death frequency, and that machine learning predictions were less viable than expected due to data imbalances.

Mortality is an active area of research in every country, and its conclusions are drawn from the available data and the conditions under which that data was collected. Deep domain knowledge is helpful but not mandatory (though some is still required) for deriving conclusions from data using machine learning and data science practices. The purpose of this project was to derive conclusions from the statistics of the provided dataset and to predict the dataset's label(s) using supervised or unsupervised learning algorithms. Based on a sample, the study estimated life expectancy regardless of gender, together with its central tendencies; it also found that marital status affected how frequently deaths occurred in each group. The study further suggests that, because the data mixes categorical and numerical features and some class labels may be far more frequent than others, anomaly detection or under-sampling could be a viable remedy. Overall, the study shows that machine learning predictions are less viable for this data than they might first appear.
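The under-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes records stored as plain dictionaries with a single class label. The function name `undersample`, the field names, and the toy data are hypothetical and are not taken from the paper or from the CDC schema.

```python
import random
from collections import defaultdict


def undersample(records, label_key, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_label = defaultdict(list)
    for row in records:
        by_label[row[label_key]].append(row)

    # The rarest class sets the number of rows kept per class.
    target = min(len(rows) for rows in by_label.values())

    balanced = []
    for rows in by_label.values():
        balanced.extend(rng.sample(rows, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy records (illustrative only, not CDC data): 90 vs. 10 rows.
records = (
    [{"age": 70 + i % 20, "marital_status": "M", "label": "natural"} for i in range(90)]
    + [{"age": 30 + i, "marital_status": "S", "label": "accident"} for i in range(10)]
)

balanced = undersample(records, "label")

counts = {}
for row in balanced:
    counts[row["label"]] = counts.get(row["label"], 0) + 1
print(counts)
```

Random under-sampling discards majority-class rows, so it trades information for balance; whether that trade is acceptable for this dataset is not something the abstract settles.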

