Question: How to solve Problem 2 using R?


How do I solve Problem 2 using R?
Problem 1. Boston House Data

Recall that in Project 2, Problem 3, we analyzed the Boston housing data. The description of this data set can be found at: https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.names

You can attach the Boston data set from the MASS package:

    library(MASS)
    attach(Boston)

For the Boston data set, we are interested in predicting whether a given suburb has a crime rate above or below the median, given the other information in the data. Recall that in Problem 3 of Project 2 we conducted a logistic regression analysis in which we used the first 405 rows as the training set and the rest as the test set.

1. For the logistic regression in Project 2, Problem 3 (a), perform forward and backward model selection. Hint: you can use the step() function in R. If you did the analysis in Project 2, Problem 3 (e), you can skip this part and refer to your previous analysis directly.
2. For the logistic regression in Project 2, Problem 3 (a), perform variable selection using the LASSO method.
3. Compare the full model and the models selected in parts 1 and 2 using 5-fold cross-validation. Which model is the best, and why?

Problem 2. Model Selection

In this problem, you will generate simulated data and then use this data to perform best subset selection.

1. Use the rnorm function to generate a predictor X of length n = 100 with parameters (μ = 5, 3), as well as a noise vector ε from N(0, 1) of length n = 100.
2. Generate a response vector Y of length n = 100 according to the model
   Y = β0 + β1 X + β2 X^2 + β3 X^3 + ε,
   where β0 = 3, β1 = 2, β2 = 0.25, and β3 = 2.

For the following question, suppose that you know nothing about the generation of
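A minimal R sketch of steps 1 and 2 of Problem 2, followed by one possible way to run best subset selection in base R. Several things here are assumptions rather than part of the problem statement: the 3 in the predictor's distribution is read as a standard deviation (use sd = sqrt(3) if it is meant as a variance), the model is read as a cubic in X with additive noise, the candidate predictors are taken to be X, X^2, ..., X^max_degree with a hypothetical max_degree of 5, and BIC is used as the selection criterion. The seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the simulation is reproducible

n <- 100

# Step 1: predictor and noise vector.
# ASSUMPTION: the "3" is the standard deviation of X.
x   <- rnorm(n, mean = 5, sd = 3)
eps <- rnorm(n, mean = 0, sd = 1)

# Step 2: response from the cubic model with the stated coefficients.
b0 <- 3; b1 <- 2; b2 <- 0.25; b3 <- 2
y <- b0 + b1 * x + b2 * x^2 + b3 * x^3 + eps

# Best subset selection by exhaustive search (base R only).
# ASSUMPTION: candidate predictors are x^1, ..., x^max_degree.
max_degree <- 5  # hypothetical choice, not given in the problem
dat <- data.frame(y = y, sapply(1:max_degree, function(d) x^d))
names(dat) <- c("y", paste0("x", 1:max_degree))

# Enumerate every non-empty subset of the candidate terms.
terms <- names(dat)[-1]
subsets <- unlist(
  lapply(seq_along(terms), function(k) combn(terms, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit one linear model per subset and record its BIC.
fits <- lapply(subsets, function(s) lm(reformulate(s, response = "y"), data = dat))
results <- data.frame(
  size  = lengths(subsets),
  terms = sapply(subsets, paste, collapse = " + "),
  bic   = sapply(fits, BIC)
)

# The subset with the smallest BIC is the selected model.
best <- results[which.min(results$bic), ]
print(best)
```

With 5 candidate terms the search fits 2^5 - 1 = 31 models, which is cheap. For larger candidate sets, the regsubsets() function in the leaps package performs the same kind of search more efficiently, and criteria such as Cp or adjusted R^2 can be compared alongside BIC.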
