Question:
The Data Set (This Dataset is available on Kaggle.com)
The data for the project comes from the Global Health Observatory (GHO) data repository, which the World Health Organization maintains. It gives data on life expectancy and associated health factors across 193 countries for the years 2000-2015. The data frame's dimensions are 1649 by 17. (P.S. I'm not sure how to attach the dataset file here.)
Please provide R code and a discussion of each of the following points:
- Checking for missing values in the data and, if any are found, describing and implementing a plan for dealing with them
- Examining the distribution shapes for the numerical variables graphically and numerically
- Investigating pairwise relationships between variables. The ggpairs function in the GGally library provides a very nice graphical display for this. It requires the ggplot2 library for its functionality
- After your data investigation, split the data into two portions, one for training and the other for testing. Choose your own percentages for the split. You'll use these two sets for all model creation and testing to come
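The four points above could be approached roughly as in the sketch below. Because the dataset file is not attached to the question, the code builds a small synthetic stand-in; the column names (`Status`, `Life.expectancy`, `Adult.Mortality`, `BMI`, `GDP`), the injected missing values, the median imputation, and the 80/20 split are all assumptions made for illustration, not properties of the real data. With the Kaggle file downloaded, the stand-in block would be replaced by a `read.csv()` call on the actual file.

```r
# --- Stand-in data (hypothetical columns; replace with read.csv() on the real file) ---
set.seed(42)
n <- 200
life <- data.frame(
  Status          = factor(sample(c("Developed", "Developing"), n, replace = TRUE)),
  Life.expectancy = rnorm(n, mean = 70, sd = 8),
  Adult.Mortality = rgamma(n, shape = 2, rate = 0.012),
  BMI             = rnorm(n, mean = 38, sd = 15),
  GDP             = rlnorm(n, meanlog = 8, sdlog = 1.2)
)
# Inject a few NAs so the missing-value step has something to find (illustration only).
life$BMI[sample(n, 10)] <- NA
life$GDP[sample(n, 15)] <- NA

# --- 1. Missing values: count per column, then impute numeric columns ---
na_counts <- colSums(is.na(life))
print(na_counts)

num_cols   <- names(life)[sapply(life, is.numeric)]
life_clean <- life
for (col in num_cols) {
  # Median imputation is one simple plan; it is a no-op for columns without NAs.
  med <- median(life_clean[[col]], na.rm = TRUE)
  life_clean[[col]][is.na(life_clean[[col]])] <- med
}

# --- 2. Distribution shapes: numerical summaries and histograms ---
print(summary(life_clean[num_cols]))
# Simple moment-based skewness estimate (positive = right-skewed).
skewness <- sapply(life_clean[num_cols], function(x) mean((x - mean(x))^3) / sd(x)^3)
print(round(skewness, 2))

op <- par(mfrow = c(2, 2))
for (col in num_cols) hist(life_clean[[col]], main = col, xlab = col)
par(op)

# --- 3. Pairwise relationships: correlations plus a scatterplot matrix ---
print(round(cor(life_clean[num_cols]), 2))
if (requireNamespace("GGally", quietly = TRUE)) {
  print(GGally::ggpairs(life_clean))   # needs GGally and ggplot2 installed
} else {
  pairs(life_clean[num_cols])          # base-R fallback
}

# --- 4. Train/test split (80/20 chosen arbitrarily) ---
train_idx <- sample(seq_len(nrow(life_clean)), size = floor(0.8 * nrow(life_clean)))
train <- life_clean[train_idx, ]
test  <- life_clean[-train_idx, ]
cat("train rows:", nrow(train), " test rows:", nrow(test), "\n")
```

For the discussion part: if only a handful of rows have missing values, dropping them with `na.omit()` is a reasonable alternative to imputation, and right-skewed variables (a GDP-like column would be a typical candidate) may be worth log-transforming before modelling. Calling `set.seed()` before the split keeps the training and test sets reproducible across runs.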