Problem 4: Predictive Analytics - Model Building, Training, & Assessment [70 pts]

MIS 4390 Business Analytics, Fall 2021

In this predictive analytics problem, we will build a LINEAR model in Python based on a given dataset and a prediction objective. Once the model is developed, we will train it on a portion of the dataset to develop its predictive ability. Next, we will use the trained model to make predictions on the remaining data. As a final task, we will evaluate how accurate these predictions are by comparing the predicted values with the actual values from the dataset. This gives us a measure of the model's predictive accuracy.

Target Dataset File: health_insurance_cost.csv (provided as a separate file)

Data Description: This file contains various attributes of individuals and their insurance costs. Insurance premiums depend on many different factors, and this data is a small, simplified sample of client profiles along with their premiums.

Prediction Objective: Since we are building a single-variable linear predictive model, we will use a SINGLE variable to predict the outcome. In this problem you may use either AGE or BMI (not both) of the individual to predict the insurance premium. We will ignore all other factors. A more complex model would include additional variables to predict the outcome.

Task sequence:
A. Load data file
B. Examine the data
C. Clean the dataset
D. Handle missing data & remove any unnecessary columns
E. Build model
F. Split data and train model
G. Test model
H. Report & analyze model efficiency
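Tasks A through D (load, examine, clean, handle missing data and drop unneeded columns) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the required solution; pandas is a common alternative in a course setting. The function name `load_clean` is hypothetical, and the code assumes the CSV is comma-delimited with a header row matching the column names in the sample table (`age`, `bmi`, `charges`).

```python
import csv
import io
import math


def load_clean(source, feature="age", target="charges"):
    """Read CSV rows from a file-like object and return (pairs, skipped).

    Only the chosen feature column and the target column are kept; all other
    columns are ignored. Rows whose feature or target value is missing,
    non-numeric, or NaN are skipped and counted.
    """
    pairs = []
    skipped = 0
    for row in csv.DictReader(source):
        try:
            x = float(row[feature])
            y = float(row[target])
        except (TypeError, ValueError):
            skipped += 1  # empty or non-numeric cell
            continue
        if math.isnan(x) or math.isnan(y):
            skipped += 1  # literal "nan" in the file
            continue
        pairs.append((x, y))
    return pairs, skipped


# Demo on the first two rows of the sample table. For the real file, use
# `with open("health_insurance_cost.csv", newline="") as f:` instead.
demo = io.StringIO(
    "age,sex,bmi,children,smoker,region,charges\n"
    "19,female,27.9,0,yes,southwest,16884.92\n"
    "18,male,33.77,1,no,southeast,1725.552\n"
)
pairs, skipped = load_clean(demo, feature="age")
print(f"kept {len(pairs)} rows, skipped {skipped}")
print("first rows:", pairs[:5])
```

Passing `feature="bmi"` selects BMI as the single predictor instead of age. A misspelled column name raises `KeyError` rather than silently dropping every row.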
health_insurance_cost.csv (provided as a separate file); a portion of the data is shown below:
| age | sex | bmi | children | smoker | region | charges |
| --- | --- | --- | --- | --- | --- | --- |
| 19 | female | 27.9 | 0 | yes | southwest | 16884.92 |
| 18 | male | 33.77 | 1 | no | southeast | 1725.552 |
| 28 | male | 33 | 3 | no | southeast | 4449.462 |
| 33 | male | 22.705 | 0 | no | northwest | 21984.47 |
| 32 | male | 28.88 | 0 | no | northwest | 3866.855 |
| 31 | female | 25.74 | 0 | no | southeast | 3756.622 |
| 46 | female | 33.44 | 1 | no | southeast | 8240.59 |
| 37 | female | 27.74 | 3 | no | northwest | 7281.506 |
| 37 | male | 29.83 | 2 | no | northeast | 6406.411 |
| 60 | female | 25.84 | 0 | no | northwest | 28923.14 |
| 25 | male | 26.22 | 0 | no | northeast | 2721.321 |
| 62 | female | 26.29 | 0 | yes | southeast | 27808.73 |
| 23 | male | 34.4 | 0 | no | southwest | 1826.843 |
| 56 | female | 39.82 | 0 | no | southeast | 11090.72 |
| 27 | male | 42.13 | 0 | yes | southeast | 39611.76 |
| 19 | male | 24.6 | 1 | no | southwest | 1837.237 |
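
Tasks E through H (build, split and train, test, report) can be illustrated on the 16 sample rows above. The sketch below fits an ordinary least squares line by hand using the standard library, so the arithmetic stays visible; scikit-learn's `LinearRegression` and `train_test_split` are common alternatives. The helper names (`fit_line`, `evaluate`, `split_rows`), the 75/25 split, and the fixed seed are assumptions made for illustration, and results on 16 rows say little about the full dataset.

```python
import math
import random

# (age, bmi, charges) copied from the sample rows shown above.
SAMPLE = [
    (19, 27.9, 16884.92), (18, 33.77, 1725.552), (28, 33.0, 4449.462),
    (33, 22.705, 21984.47), (32, 28.88, 3866.855), (31, 25.74, 3756.622),
    (46, 33.44, 8240.59), (37, 27.74, 7281.506), (37, 29.83, 6406.411),
    (60, 25.84, 28923.14), (25, 26.22, 2721.321), (62, 26.29, 27808.73),
    (23, 34.4, 1826.843), (56, 39.82, 11090.72), (27, 42.13, 39611.76),
    (19, 24.6, 1837.237),
]
FEATURE = 0  # 0 = age, 1 = bmi (use one, not both)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def evaluate(actual, predicted):
    """Return (RMSE, R^2) for predictions against actual values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return math.sqrt(ss_res / n), 1 - ss_res / ss_tot


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_rows(SAMPLE)
slope, intercept = fit_line([r[FEATURE] for r in train], [r[2] for r in train])
predicted = [intercept + slope * r[FEATURE] for r in test]
rmse, r2 = evaluate([r[2] for r in test], predicted)

print(f"charges = {intercept:.2f} + {slope:.2f} * x")
print(f"test RMSE = {rmse:.2f}, test R^2 = {r2:.3f}")
```

R^2 on the held-out rows can be negative when the line predicts worse than the test-set mean, which is plausible here because smoker status, ignored by the single-variable model, drives several of the largest charges in the sample.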
