Question: Part 1 (100 points). The College data set is available in the ISLR library. Load the College data into the R environment by loading the ISLR library.

Description of the College data set in the ISLR library:

Statistics for a large number of US Colleges from the 1995 issue of US News and World Report.

A data frame with 777 observations on the following 18 variables.

Private: A factor with levels No and Yes indicating private or public university
Apps: Number of applications received
Accept: Number of applications accepted
Enroll: Number of new students enrolled
Top10perc: Pct. new students from top 10% of H.S. class
Top25perc: Pct. new students from top 25% of H.S. class
F.Undergrad: Number of fulltime undergraduates
P.Undergrad: Number of parttime undergraduates
Outstate: Out-of-state tuition
Room.Board: Room and board costs
Books: Estimated book costs
Personal: Estimated personal spending
PhD: Pct. of faculty with Ph.D.'s
Terminal: Pct. of faculty with terminal degree
S.F.Ratio: Student/faculty ratio
perc.alumni: Pct. alumni who donate
Expend: Instructional expenditure per student
Grad.Rate: Graduation rate

We will predict the number of applications received (Apps) from all other variables in the College data set by fitting a LASSO model.
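For reference, the objective that glmnet() minimizes for a Gaussian response with alpha = 1 (the lasso penalty) can be written as below. The symbols are notation introduced here, not taken from the assignment: n is the number of training observations, p the number of predictors, x_{ij} the value of predictor j for observation i, and lambda the tuning parameter.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
\;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

With lambda = 0 the penalty vanishes and the solution is the least squares fit. As lambda grows without bound, every slope coefficient is forced to zero and only the intercept remains, which is the null model.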

PERFORM LASSO MODEL:

Predict the number of applications received (Apps) from all other variables in the College data set, using the LASSO model for variable selection:

a. Split the data set randomly into a training set and a test set. (10 points)

b. Fit a Lasso model on the training set using the glmnet() function. (10 points)

c. Perform cross-validation on the training set to choose the best lambda. (10 points)

d. Using the best lambda from part (c), compute predicted values on the test set (using the predict() function) and compute the test MSE. (20 points)

e. Compare the Lasso test MSE with the test MSE of the null model (lambda = infinity) and of the least squares regression model (lambda = 0). Briefly discuss how the three test MSEs compare. (20 points)

f. Now fit the Lasso model on the entire data set, obtain the Lasso coefficients at the best lambda from part (c), and report the number of non-zero coefficient estimates. (15 points)

g. Now fit a linear regression model using the predictors selected by the Lasso in part (f), and report the summary of the linear model. (15 points)

Hint: You can refer to the program for "P2_LASSO_HittersData_OVERVIEW" as a guideline for the assignment.
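Parts (a) through (g) can be sketched in R roughly as follows. This is a minimal outline under stated assumptions, not the referenced guideline program: it assumes the ISLR and glmnet packages are installed, and the seed, the 50/50 split, the use of lambda.min as the "best" lambda, and the use of lm() for the lambda = 0 case are all choices made here for illustration.

```r
# Sketch only: assumes the ISLR and glmnet packages are installed.
library(ISLR)    # provides the College data frame
library(glmnet)  # provides glmnet() and cv.glmnet()

set.seed(1)  # arbitrary seed, used only so the split is reproducible

# glmnet() needs a numeric matrix; model.matrix() dummy-codes Private.
x <- model.matrix(Apps ~ ., data = College)[, -1]  # drop the intercept column
y <- College$Apps

# (a) Random split into training and test rows (50/50 is an assumption).
train <- sample(seq_len(nrow(x)), size = floor(nrow(x) / 2))
test <- setdiff(seq_len(nrow(x)), train)

# (b) Lasso fit on the training data (alpha = 1 selects the lasso penalty).
lasso_fit <- glmnet(x[train, ], y[train], alpha = 1)

# (c) Cross-validation on the training data to choose lambda.
cv_fit <- cv.glmnet(x[train, ], y[train], alpha = 1)
best_lambda <- cv_fit$lambda.min

# (d) Test MSE at the chosen lambda.
lasso_pred <- predict(lasso_fit, s = best_lambda, newx = x[test, ])
lasso_mse <- mean((lasso_pred - y[test])^2)

# (e) Null model (lambda = infinity): predict the training mean everywhere.
null_mse <- mean((mean(y[train]) - y[test])^2)
#     Least squares (lambda = 0): fitted here with lm() on the training rows.
ols_fit <- lm(Apps ~ ., data = College[train, ])
ols_mse <- mean((predict(ols_fit, newdata = College[test, ]) - y[test])^2)

# (f) Refit on the full data and extract coefficients at best_lambda.
full_fit <- glmnet(x, y, alpha = 1)
lasso_coef <- predict(full_fit, type = "coefficients", s = best_lambda)[, 1]
n_nonzero <- sum(lasso_coef[-1] != 0)  # count excludes the intercept

# (g) Linear model on the predictors the lasso kept.
kept <- names(lasso_coef)[-1][lasso_coef[-1] != 0]
ols_selected <- lm(y ~ ., data = data.frame(y = y, x[, kept, drop = FALSE]))
summary(ols_selected)
```

Printing lasso_mse, null_mse, and ols_mse side by side gives the three numbers that part (e) asks you to discuss. The exact values depend on the seed and the split.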
