Linear Model Selection and Regularization
You use the glmnet package to perform lasso regression. parsnip does not have a dedicated function to create a lasso regression model specification, so you need to use linear_reg() and set mixture = 1 to specify a lasso model. The mixture argument specifies the mix between the two types of regularization: mixture = 0 specifies only ridge regularization and mixture = 1 specifies only lasso regularization. Setting mixture to a value between 0 and 1 uses both.
The following procedure will be very similar to what we saw in the ridge regression section. The preprocessing needed is the same, but let us write it out again.
# Run this code from the previous assignment to get you properly started.
library(tidymodels)
library(ISLR2)
Hitters <- as_tibble(Hitters) %>%
  filter(!is.na(Salary))
Hitters_split <- initial_split(Hitters, strata = "Salary")
Hitters_train <- training(Hitters_split)
Hitters_test <- testing(Hitters_split)
Hitters_fold <- vfold_cv(Hitters_train, v = 10)
Run the block of code below.
lasso_recipe <-
  recipe(formula = Salary ~ ., data = Hitters_train) %>%
  step_novel(all_nominal_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors())
Next, finish the lasso regression workflow, producing the two outputs lasso_spec and lasso_workflow. For the lasso_spec output use the linear_reg, set_mode, and set_engine functions. For the lasso_workflow output use the add_recipe and add_model functions.
lasso_spec <-
  linear_reg(penalty = tune(), mixture = 1) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

lasso_workflow <- workflow() %>%
  add_recipe(lasso_recipe) %>%
  add_model(lasso_spec)
While you are doing a different kind of regularization, you will still use the same penalty argument. I have picked a range for the values of penalty that I know works well; in practice you would cast a wide net at first and then narrow in on the range of interest. Create the output penalty_grid using the function grid_regular, with 50 levels and a range of [-2, 2].
# your code here
# Note: range is an argument to penalty(), while levels is an
# argument to grid_regular(), not to penalty().
penalty_grid <- grid_regular(penalty(range = c(-2, 2)), levels = 50)
library(testthat)
expect_equal(penalty_grid$penalty[1], 0.01)
expect_equal(penalty_grid$penalty[25], 0.910298177991522)
expect_equal(penalty_grid$penalty[50], 100)
Now you can use tune_grid() again. Store the result in the output tune_res using the function tune_grid. Use autoplot to plot your tune_res output. Your output should resemble this plot.
# your code here
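One way this step might look, as a sketch that assumes the lasso_workflow, Hitters_fold, and penalty_grid objects defined above:

```r
# Tune the penalty over the resamples using the regular grid,
# then visualize performance across penalty values.
tune_res <- tune_grid(
  lasso_workflow,
  resamples = Hitters_fold,
  grid = penalty_grid
)
autoplot(tune_res)
```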
Next, you should select the best value of penalty using select_best(). Your output variable here is best_penalty. Use "rsq" as the metric.
# *your code here*
# best_penalty <-
# your code here
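A minimal sketch of this selection step, assuming tune_res holds the tuning results from the previous step:

```r
# Pick the penalty value with the best cross-validated R-squared.
best_penalty <- select_best(tune_res, metric = "rsq")
```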
You should now refit using the whole training data set. Your first output variable should be lasso_final, created with the function finalize_workflow, and your second output variable should be lasso_final_fit, created with the fit function.
# your code here
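A sketch of the refitting step, assuming lasso_workflow and best_penalty from the steps above:

```r
# Plug the selected penalty into the workflow, then fit on
# the full training set.
lasso_final <- finalize_workflow(lasso_workflow, best_penalty)
lasso_final_fit <- fit(lasso_final, data = Hitters_train)
```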
Finalize this by calculating the rsq value for the lasso model. You will see that for this data ridge regression does better than lasso regression. Verify this using augment and then the rsq function. Store the output in the variable rsq_val.
# *your code here*
# rsq_val <- augment()
# your code here
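A sketch of the evaluation step, assuming lasso_final_fit from the previous step; augment() adds a .pred column of predictions on the test set, which rsq() compares against the true Salary values:

```r
# Predict on the held-out test set and compute R-squared.
rsq_val <- augment(lasso_final_fit, new_data = Hitters_test) %>%
  rsq(truth = Salary, estimate = .pred)
```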
