Question:
2. You will now fit a linear regression model to the diabetes data set, and evaluate its predictive performance. Work with the LinearRegression class in sklearn.linear_model.
(a) Fit a LinearRegression model to the diabetes data set. Report the rsquared goodness of fit metric for the model after fitting, as well as the fraction of instances for which the model's predictions match the corresponding target values exactly. Compute the latter value directly in numpy. Discuss the results, comparing the overall assessments implied by the two metrics. Do the two metrics paint a similar picture about the quality of the model? Explain any disagreements between the two. Include the text of your code, the results, and discussion in your writeup.
(b) Split the list of row indices of all of the instances of the data set randomly into two non-overlapping parts, one of which has size 300; the other part will then be about half that size. Do this by using random.shuffle on the sorted list of indices, and then simply taking head and tail segments of the appropriate lengths. The set of 300 indices will be called the training set, and the other one the test set. Fit a LinearRegression model on the data instances corresponding to the row numbers in the training set. Report two r2 goodness of fit values for this model: first, the r2 value for the training instances; second, the r2 value for the test instances. Discuss the results. Are the r2 values the same? If not, which of the two represents a better fit? Is the result surprising?
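
The two parts above could be approached along the following lines. This is only a minimal sketch under stated assumptions, not a definitive solution: it assumes that "the diabetes data set" is the one bundled with scikit-learn (sklearn.datasets.load_diabetes), and that the rsquared / r2 value is what LinearRegression.score returns. The seed and all variable names are illustrative choices that do not come from the question.

```python
import random

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Assumption: the data set meant is scikit-learn's bundled diabetes data.
X, y = load_diabetes(return_X_y=True)

# Part (a): fit on all instances and compute the two metrics.
model = LinearRegression().fit(X, y)
r2_full = model.score(X, y)  # coefficient of determination on the fitted data
# Fraction of instances whose prediction equals the target exactly (numpy only).
exact_fraction = np.mean(model.predict(X) == y)

# Part (b): shuffle the sorted list of row indices, then take head and tail.
random.seed(0)  # illustrative seed, only so the sketch is repeatable
indices = list(range(len(y)))
random.shuffle(indices)
train_idx, test_idx = indices[:300], indices[300:]

# Fit on the training rows only, then score on both parts.
split_model = LinearRegression().fit(X[train_idx], y[train_idx])
r2_train = split_model.score(X[train_idx], y[train_idx])
r2_test = split_model.score(X[test_idx], y[test_idx])

print("r2 (all instances):", r2_full)
print("fraction of exact matches:", exact_fraction)
print("r2 (training set):", r2_train)
print("r2 (test set):", r2_test)
```

When discussing the output, it may help to keep in mind that exact equality is a very strict criterion for a continuous-valued prediction, so the exact-match fraction and r2 can give quite different impressions of the same model. The training and test r2 values will generally differ, and the test value depends on the particular random split.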
