Question: Please help with the following exercise.

8. We will now perform cross-validation on a simulated data set.

(a) Generate a simulated data set as follows:

    > set.seed(1)
    > y=rnorm(100)
    > x=rnorm(100)
    > y=x-2*x^2+rnorm(100)

In this data set, what is n and what is p? Write out the model used to generate the data in equation form.

(b) Create a scatterplot of X against Y. Comment on what you find.

(c) Set a random seed, and then compute the LOOCV errors that result from fitting the following four models using least squares:

i. $Y = \beta_0 + \beta_1 X + \epsilon$
ii. $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$
iii. $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \epsilon$
iv. $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \beta_4 X^4 + \epsilon$

Note you may find it helpful to use the data.frame() function to create a single data set containing both X and Y.

(d) Repeat (c) using another random seed, and report your results. Are your results the same as what you got in (c)? Why?

(e) Which of the models in (c) had the smallest LOOCV error? Is this what you expected? Explain your answer.

(f) Comment on the statistical significance of the coefficient estimates that results from fitting each of the models in (c) using least squares. Do these results agree with the conclusions drawn based on the cross-validation results?
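
One possible way to approach parts (a), (c) and (f) in base R is sketched below. The data generation is copied from part (a). Everything else is an assumption made for illustration: the names `sim`, `loocv_mse` and `loocv` are hypothetical, and instead of refitting each model n times the sketch uses the leave-one-out identity for least squares, CV = mean(((y_i - yhat_i) / (1 - h_i))^2), where h_i is the leverage of observation i. It is not necessarily the approach the textbook intends.

```r
# Data generation copied from part (a) of the exercise.
set.seed(1)
y <- rnorm(100)                 # as given in the exercise; overwritten two lines below
x <- rnorm(100)
y <- x - 2 * x^2 + rnorm(100)

# Hypothetical name: a single data frame holding both X and Y, as the hint suggests.
sim <- data.frame(x = x, y = y)

# Hypothetical helper: LOOCV mean squared error for a polynomial of a given degree.
# For least squares fits the leave-one-out residual equals
# (ordinary residual) / (1 - leverage), so one fit per model is enough.
loocv_mse <- function(degree, data) {
  fit <- lm(y ~ poly(x, degree, raw = TRUE), data = data)
  mean((residuals(fit) / (1 - hatvalues(fit)))^2)
}

# Part (c): LOOCV error for models i to iv (degrees 1 to 4).
loocv <- sapply(1:4, loocv_mse, data = sim)
names(loocv) <- paste0("degree_", 1:4)
print(loocv)

# Part (f): coefficient estimates and p-values for the degree-4 model.
fit4 <- lm(y ~ poly(x, 4, raw = TRUE), data = sim)
print(summary(fit4)$coefficients)
```

Because LOOCV leaves out each observation exactly once, it involves no random splitting, so calling `set.seed()` with a different value before `sapply()` in part (d) should not change the values in `loocv` as long as the data in `sim` are left unchanged.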
