Linear Model Selection and Regularization
You use the glmnet package to perform lasso regression. parsnip does not have a dedicated function to create a lasso regression model specification, so you need to use linear_reg() and set mixture = 1 to specify a lasso model. The mixture argument specifies the mix between the two types of regularization: mixture = 0 specifies only ridge regularization and mixture = 1 specifies only lasso regularization. Setting mixture to a value between 0 and 1 uses both.
The following procedure will be very similar to what we saw in the ridge regression section. The preprocessing needed is the same, but let us write it out again.
# Run this code from the previous assignment to get you properly started.
library(tidymodels)
library(ISLR2)
Hitters <- as_tibble(Hitters) %>%
  filter(!is.na(Salary))
Hitters_split <- initial_split(Hitters, strata = "Salary")
Hitters_train <- training(Hitters_split)
Hitters_test <- testing(Hitters_split)
Hitters_fold <- vfold_cv(Hitters_train, v = 10)
Run the block of code below.
lasso_recipe <-
  recipe(formula = Salary ~ ., data = Hitters_train) %>%
  step_novel(all_nominal_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors())
Next, finish the lasso regression workflow, producing the two outputs lasso_spec and lasso_workflow. For the lasso_spec output use the linear_reg, set_mode, and set_engine functions. For the lasso_workflow output use the add_recipe and add_model functions.
lasso_spec <-
  linear_reg(penalty = tune(), mixture = 1) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

lasso_workflow <- workflow() %>%
  add_recipe(lasso_recipe) %>%
  add_model(lasso_spec)
While you are doing a different kind of regularization, you will still use the same penalty argument. I have picked a range for the values of penalty that I know works well; in practice you would cast a wide net at first and then narrow in on the range of interest. Create the output penalty_grid using the function grid_regular, with 50 levels and a range of [-2, 2].
# your code here
# Note: range is an argument to penalty(), while levels is an
# argument to grid_regular(), not to penalty().
penalty_grid <- grid_regular(penalty(range = c(-2, 2)), levels = 50)
library(testthat)
expect_equal(penalty_grid$penalty[1], 0.01)
expect_equal(penalty_grid$penalty[25], 0.910298177991522)
expect_equal(penalty_grid$penalty[50], 100)
Now you can use tune_grid() again. Store the result in the output tune_res using the function tune_grid. Use autoplot to plot your tune_res output. Your output should resemble this plot.
# your code here
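One way this step might look, as a sketch that assumes the lasso_workflow, Hitters_fold, and penalty_grid objects defined above:

```r
# Tune the penalty over the resamples using the regular grid,
# then visualize performance across penalty values.
tune_res <- tune_grid(
  lasso_workflow,
  resamples = Hitters_fold,
  grid = penalty_grid
)
autoplot(tune_res)
```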
Next, you should select the best value of penalty using select_best(). Your output variable here is best_penalty. Use "rsq" as the metric.
# *your code here*
# best_penalty <-
# your code here
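A minimal sketch of this selection step, assuming tune_res holds the tuning results from the previous step:

```r
# Pick the penalty value with the best cross-validated R-squared.
best_penalty <- select_best(tune_res, metric = "rsq")
```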
You should now refit using the whole training data set. Your first output variable should be lasso_final, created with the function finalize_workflow, and your second output variable should be lasso_final_fit, created with the fit function.
# your code here
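A sketch of the refitting step, assuming lasso_workflow and best_penalty from the steps above:

```r
# Plug the selected penalty into the workflow, then fit on
# the full training set.
lasso_final <- finalize_workflow(lasso_workflow, best_penalty)
lasso_final_fit <- fit(lasso_final, data = Hitters_train)
```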
Finalize this by calculating the rsq value for the lasso model. You will see that for this data ridge regression does better than lasso regression. Verify this using augment and then the rsq function. Store the output in the variable rsq_val.
# *your code here*
# rsq_val <- augment()
# your code here
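A sketch of the evaluation step, assuming lasso_final_fit from the previous step; augment() adds a .pred column of predictions on the test set, which rsq() compares against the true Salary values:

```r
# Predict on the held-out test set and compute R-squared.
rsq_val <- augment(lasso_final_fit, new_data = Hitters_test) %>%
  rsq(truth = Salary, estimate = .pred)
```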
