Question:

In this assignment you will continue to process the dataset (diabetes.csv) using linear regression models and scikit-learn libraries. Write a report on your observations.
Download diabetes_df.csv (the file you created during Assignment-1).
Create a Pandas dataframe from diabetes_df.csv and call it assignment2_df.
Set up the Machine Learning Model:
Divide the data into a features (X) array and a target (y) array.
Split the dataset into 80-20, 70-30, and 60-40 ratios. (Example: 80-20 means 80% training data and 20% testing data, and so on.)
For each data split, apply a logistic regression model to build a confusion matrix and accuracy estimates.
Which data split gives the best accuracy?
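The model setup steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive solution: the target column name `Outcome`, the helper names `load_assignment2_df` and `evaluate_splits`, the random seed, and the stratified split are all assumptions, since the question does not describe the columns of diabetes_df.csv. If the file is not found, the sketch falls back to synthetic placeholder data so that it still runs.

```python
from pathlib import Path

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

TARGET = "Outcome"  # ASSUMED target column name; adjust to match your file


def load_assignment2_df(path="diabetes_df.csv"):
    """Load the Assignment-1 file, or build a synthetic stand-in if it is absent."""
    if Path(path).exists():
        return pd.read_csv(path)
    # Synthetic placeholder so the sketch runs anywhere; NOT the real dataset.
    X_fake, y_fake = make_classification(n_samples=500, n_features=8, random_state=0)
    df = pd.DataFrame(X_fake, columns=[f"feature_{i}" for i in range(8)])
    df[TARGET] = y_fake
    return df


def evaluate_splits(df, test_sizes=(0.2, 0.3, 0.4), seed=42):
    """Fit logistic regression on each split; return confusion matrix and accuracy."""
    X = df.drop(columns=[TARGET])  # features array
    y = df[TARGET]                 # target array
    results = {}
    for test_size in test_sizes:
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, random_state=seed, stratify=y
        )
        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[test_size] = {
            "confusion_matrix": confusion_matrix(y_test, y_pred),
            "accuracy": accuracy_score(y_test, y_pred),
        }
    return results


assignment2_df = load_assignment2_df()
split_results = evaluate_splits(assignment2_df)

for test_size, res in split_results.items():
    train_pct = round((1 - test_size) * 100)
    print(f"{train_pct}-{100 - train_pct} split: accuracy = {res['accuracy']:.3f}")
    print(res["confusion_matrix"])

# The split with the highest test accuracy, keyed by its test fraction.
best_test_size = max(split_results, key=lambda k: split_results[k]["accuracy"])
print("Best test fraction:", best_test_size)
```

Accuracy differences between the three splits can be small and depend on the random seed, so it may be worth repeating the comparison with a few different seeds before naming a best split in the report.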
For the data split selected in step 4, run the bootstrap analysis and calculate the p-value and confidence intervals.
Write a short report documenting your observations, such as accuracy, threshold (for the receiver operating characteristic curve), p-value for model acceptance, and a histogram showing the confidence intervals.
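The question does not say how the bootstrap p-value should be defined, so the following is only one possible reading, sketched under stated assumptions: the fitted model's test-set predictions are resampled with replacement, the 95% confidence interval is taken from the percentiles of the resampled accuracies, and the p-value is the share of resamples whose accuracy does not exceed a majority-class baseline. The 80-20 split, the synthetic placeholder data, the number of resamples, and the use of Youden's J to pick a ROC threshold are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic placeholder data; replace with X and y taken from assignment2_df.
# If y is a pandas Series, convert it with y.to_numpy() so positional indexing works.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# ASSUMED: the 80-20 split was selected; substitute whichever split scored best.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1

# Bootstrap: resample test rows with replacement and recompute accuracy each time.
rng = np.random.default_rng(0)
n_boot = 2000
n_test = len(y_test)
boot_acc = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n_test, size=n_test)
    boot_acc[b] = accuracy_score(y_test[idx], y_pred[idx])

# 95% percentile confidence interval for the test accuracy.
ci_low, ci_high = np.percentile(boot_acc, [2.5, 97.5])

# Baseline: accuracy of always predicting the majority class of the test set.
baseline = max(np.mean(y_test), 1 - np.mean(y_test))

# One-sided p-value estimate: share of resamples no better than the baseline
# (the +1 terms keep the estimate strictly above zero).
p_value = (np.sum(boot_acc <= baseline) + 1) / (n_boot + 1)

# ROC threshold that maximizes Youden's J statistic (TPR - FPR).
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
best_threshold = thresholds[np.argmax(tpr - fpr)]

print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
print(f"95% CI: [{ci_low:.3f}, {ci_high:.3f}]")
print(f"p-value vs. majority-class baseline: {p_value:.4f}")
print(f"ROC threshold (Youden's J): {best_threshold:.3f}")
```

For the histogram in the report, the `boot_acc` array can be plotted with any plotting library, with vertical lines drawn at `ci_low` and `ci_high` to mark the confidence interval.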
