Question 3 Programming supervised learning
In this question, you use an OpenML dataset of diabetes with ID 42363, https://www.openml.org/search?type=data&status=active&id=42363. The dataset can be loaded using the following code:
from sklearn.datasets import fetch_openml
dataset = fetch_openml(data_id=42363)
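As a rough sketch of what the loaded object holds: `fetch_openml` returns a scikit-learn `Bunch` whose `data`, `target`, `feature_names`, `target_names` and `DESCR` fields can be inspected directly. The helper names `summarize` and `to_frame`, the column name `target`, and the tiny stand-in `Bunch` below are illustrative assumptions, used so that the snippet runs without a network connection; they are not part of the dataset or of the question.

```python
import pandas as pd
from sklearn.utils import Bunch


def summarize(dataset):
    """Collect the basic facts about a Bunch such as the one fetch_openml returns."""
    return {
        "data_shape": dataset.data.shape,
        "target_shape": dataset.target.shape,
        "feature_names": list(dataset.feature_names),
        "target_names": list(dataset.target_names),
        "description": dataset.DESCR,
    }


def to_frame(dataset):
    """Build one DataFrame holding the feature columns plus the target column."""
    df = pd.DataFrame(dataset.data, columns=dataset.feature_names)
    df[dataset.target_names[0]] = list(dataset.target)
    return df


# Hypothetical stand-in values (NOT the OpenML data), so the sketch runs offline.
demo = Bunch(
    data=pd.DataFrame({"temp": [8.2, 18.0, 14.6], "RH": [51, 33, 33]}),
    target=pd.Series([0.0, 1.5, 0.0], name="target"),
    feature_names=["temp", "RH"],
    target_names=["target"],
    DESCR="Stand-in description for illustration only.",
)

info = summarize(demo)
for key, value in info.items():
    print(key, ":", value)

frame = to_frame(demo)
print(frame)             # in a notebook, display(frame) gives nicer output
print(frame.describe())  # descriptive statistics of every numeric column
```

With the real dataset, the same helpers would be called on the object returned by `fetch_openml(data_id=42363)` instead of `demo`.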
(a) Write code to load and explore the dataset by printing its feature data's shape, target's shape, feature names, target names, and textual description. [4]
(b) Write code to create a pandas DataFrame from the dataset, and display the DataFrame and the DataFrame's descriptive statistics. (Hint: use the display() function for better formatted output.) [4]
(c) Write code to train a linear regression model for the dataset. Use only the last 4 features in the dataset, i.e. temp, RH, wind and rain. Split the dataset into a training set and a test set, and do not use cross-validation. Display the test score of the model. [5]
(d) Write code to train a linear regression model for the dataset using 10-fold cross-validation. Use only the last 4 features in the dataset, i.e. temp, RH, wind and rain. Use neg_mean_absolute_error for scoring in CV. Display the mean score. [5]
(e) Write code to train an SVR model for the dataset using 10-fold cross-validation. Use only the last 4 features in the dataset, i.e. temp, RH, wind and rain. Use neg_mean_absolute_error for scoring in CV. Display the mean score. [5]
(f) State the scores of the models in parts (d) and (e), and comment on which of the two models performs better. [2]
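One possible way to approach parts (c) to (e) is sketched below. It is not an official solution: the helper names `holdout_r2` and `cv_mean_score`, the default train/test split, the default `SVR()` hyperparameters, and the synthetic data are all assumptions made so that the example runs offline. The synthetic values say nothing about the scores the real dataset would produce.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVR

# The last four features named in the question.
FEATURES = ["temp", "RH", "wind", "rain"]


def holdout_r2(X, y, seed=0):
    """Part (c): fit on a training split, return R^2 on the held-out test split."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)
    model = LinearRegression().fit(X_train, y_train)
    return model.score(X_test, y_test)


def cv_mean_score(model, X, y, folds=10):
    """Parts (d) and (e): mean neg_mean_absolute_error over the CV folds."""
    scores = cross_val_score(
        model, X, y, cv=folds, scoring="neg_mean_absolute_error"
    )
    return scores.mean()


# Synthetic stand-in data (NOT the OpenML dataset), so the sketch runs offline.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=FEATURES)
y = 2.0 * X["temp"] - X["RH"] + rng.normal(scale=0.1, size=200)

r2 = holdout_r2(X, y)
lin_score = cv_mean_score(LinearRegression(), X, y)
svr_score = cv_mean_score(SVR(), X, y)

print("hold-out R^2:", r2)
print("LinearRegression mean CV score:", lin_score)
print("SVR mean CV score:", svr_score)
```

With the real dataset, and assuming `dataset.data` is a DataFrame containing those four column names as the question states, `X = dataset.data[FEATURES]` and `y = dataset.target` would replace the synthetic arrays. For part (f), note that the scorer is the negated mean absolute error, so the model whose mean score is closer to zero has the smaller error.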
