Question: Using RStudio and the College data set from the ISLR2 library, answer the following questions: In this exercise, we will predict the number of applications

Using RStudio and the College data set from the ISLR2 library, answer the following questions:

In this exercise, we will predict the number of applications received using the other variables in the College data set as predictors. First, check the data and remove NA values if needed. Then split the data set into a training set and a test set.

a. Use three methods (best subset, forward stepwise, and backward stepwise selection) to choose the best model on the training set, and use the chosen model to predict the number of college applications in the test set. Report the test error obtained. Plot the training-set errors to support your results.
b. Fit a ridge regression model on the training set, with λ chosen by cross-validation. Use the trained model to predict the number of college applications in the test set. Report the test error obtained.
c. Fit a lasso model on the training set, with λ chosen by cross-validation. Use the trained model to predict the number of college applications in the test set. Report the test error obtained, along with the number of non-zero coefficient estimates.
d. Fit a PCR model on the training set, with M (the number of components) chosen by cross-validation. Use the trained model and M to predict the number of college applications in the test set. Report the test error obtained, along with the value of M selected by cross-validation. Plot the training-set errors to support your results.
e. Fit a PLS model on the training set, with M chosen by cross-validation. Use the trained model and M to predict the number of college applications in the test set. Report the test error obtained, along with the value of M selected by cross-validation. Plot the training-set errors to support your results.
f. Summarize the test errors of the 7 models (3 from part a and 4 from parts b-e) in a table and comment on the results: which model works best for the data, any suggestions, and so on.
g. Fit a PCA model to the training set. Choose the number of components that together explain at least 85% of the variance, and use them to predict the number of college applications in the test set. Compare the results with the PCR and PLS results from parts d and e.
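As a starting point, here is a minimal sketch of the data preparation plus parts b through e (ridge, lasso, PCR, PLS). It is not a full solution: subset selection (part a) and the PCA variant (part g) are left out. The seed, the 50/50 split, the use of lambda.min, and every object and helper name (train_idx, mse, best_m, results) are assumptions made for illustration, and the code assumes the ISLR2, glmnet, and pls packages are installed.

```r
# Sketch under stated assumptions: seed, split ratio, and all names are illustrative.
library(ISLR2)   # College data set
library(glmnet)  # ridge and lasso
library(pls)     # PCR and PLS

data(College)
College <- na.omit(College)            # drop rows with missing values, if any

set.seed(1)                            # assumed seed, for reproducibility only
n <- nrow(College)
train_idx <- sample(n, floor(n / 2))   # assumed 50/50 split
train <- College[train_idx, ]
test  <- College[-train_idx, ]

# glmnet needs a numeric matrix; model.matrix() dummy-codes the factor Private.
x_train <- model.matrix(Apps ~ ., data = train)[, -1]
x_test  <- model.matrix(Apps ~ ., data = test)[, -1]
y_train <- train$Apps
y_test  <- test$Apps

# Test error is measured here as mean squared error.
mse <- function(pred, obs) mean((obs - as.vector(pred))^2)

# Parts b and c: ridge (alpha = 0) and lasso (alpha = 1), lambda by 10-fold CV.
cv_ridge <- cv.glmnet(x_train, y_train, alpha = 0)
cv_lasso <- cv.glmnet(x_train, y_train, alpha = 1)
ridge_pred <- predict(cv_ridge, newx = x_test, s = "lambda.min")
lasso_pred <- predict(cv_lasso, newx = x_test, s = "lambda.min")

# Number of non-zero lasso coefficients, excluding the intercept.
lasso_coef <- as.matrix(coef(cv_lasso, s = "lambda.min"))
n_nonzero  <- sum(lasso_coef[-1, 1] != 0)

# Parts d and e: PCR and PLS with standardized predictors and 10-fold CV.
pcr_fit <- pcr(Apps ~ ., data = train, scale = TRUE, validation = "CV")
pls_fit <- plsr(Apps ~ ., data = train, scale = TRUE, validation = "CV")

# Pick M with the lowest CV RMSEP. The first entry is the intercept-only
# model (0 components), hence the "- 1"; M is forced to be at least 1.
best_m <- function(fit) {
  cv_rmsep <- RMSEP(fit, estimate = "CV")$val[1, 1, ]
  max(1, unname(which.min(cv_rmsep)) - 1)
}
m_pcr <- best_m(pcr_fit)
m_pls <- best_m(pls_fit)

# Cross-validated error curves on the training set (the requested plots).
validationplot(pcr_fit, val.type = "MSEP")
validationplot(pls_fit, val.type = "MSEP")

pcr_pred <- predict(pcr_fit, newdata = test, ncomp = m_pcr)
pls_pred <- predict(pls_fit, newdata = test, ncomp = m_pls)

# Collect the test errors for the comparison table in part f.
results <- data.frame(
  model    = c("ridge", "lasso", "PCR", "PLS"),
  test_mse = c(mse(ridge_pred, y_test), mse(lasso_pred, y_test),
               mse(pcr_pred, y_test), mse(pls_pred, y_test))
)
print(results)
cat("Non-zero lasso coefficients:", n_nonzero, "\n")
cat("M for PCR:", m_pcr, " M for PLS:", m_pls, "\n")
```

The exact numbers depend on the seed and the split, so they should be reported from your own run. The three subset-selection models from part a could be added as extra rows of the same results table, for example using regsubsets() from the leaps package.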
