Question: Linear Model Selection and Regularization This programming assignment will use the Tidy Models platform. It will take a look at regularization models and hyperparameter tuning.

Linear Model Selection and Regularization

This programming assignment will use the Tidy Models platform. It will take a look at regularization models and hyperparameter tuning. These models contain a regularization term. This assignment will use parsnip for model fitting and recipes and workflows to perform the transformations, and tune and dials to tune the hyperparameters of the model.

You will be using the Hitters data set from the ISLR package. You wish to predict the baseball players Salary based on several different characteristics which are included in the data set.

Since you wish to predict Salary, then you need to remove any missing data from that column. Otherwise, you won't be able to run the models.

Set output as Hitters

library

(

tidymodels

)

library

(

ISLR

2)

# Your code here

# Hitters

< -

Hitters

< -

ISLR

2

::Hitters

Hitters

< -

Hitters

% > %

drop

_

(

Salary

)

# your code here

Attaching packages

tidymodels

1.0.0

broom

1.0.4

recipes

1.0.5

dials

1.1.0

rsample

1.1.1

dplyr

1.1.0

tibble

3.2.0

ggplot

2 3.4.1

tidyr

1.3.0

infer

1.0.4

tune

1.0.1

modeldata

1.1.0

workflows

1.1.3

parsnip

1.0.4

workflowsets

1.0.0

purrr

1.0.1

yardstick

1.1.0

Conflicts

tidymodels

_

conflicts

()

purrr::discard

()

masks scales::discard

()

dplyr::filter

()

masks stats::filter

()

dplyr::lag

()

masks stats::lag

()

recipes::step

()

masks stats::step

()

Use suppressPackageStartupMessages

()

to eliminate package startup messages

# Hidden Tests

You will use the glmnet package to perform ridge regression. parsnip does not have a dedicated function to create a ridge regression model specification. You need to use linear

_

reg

()

and set mixture

= 0

to specify a ridge model. The mixture argument specifies the amount of different types of regularization, mixture

= 0

specifies only ridge regularization and mixture

= 1

specifies only lasso regularization.

Setting mixture to a value between

0

and

1

lets us use both. When using the glmnet engine you also need to set a penalty to be able to fit the model. You will set this value to

0

for now, it is not the best value, but you will look at how to select the best value in a little bit.

ridge

_

spec

< -

linear

_

reg

(

mixture

= 0,

penalty

= 0) % > %

set

_

mode

("

regression

") % > %

set

_

engine

("

glmnet

")

Once the specification is created you can fit it to you data. You will use all the predictors. Use the fit function here.

ridge

_

fit

< -

fit

(

ridge

_

spec, Salary ~

.,

data

=

Hitters

)

The glmnet package will fit the model for all values of penalty at once, so you can now see see what the parameter estimate for the model is now that you have penalty

= 0 .

You can use the tidy function to accomplish this specific task.

tidy

(

ridge

_

fit

)

Loading required package: Matrix

Attaching package:

Matrix

The following objects are masked from

package:tidyr

expand, pack, unpack

Loaded glmnet

4.1 - 6

A tibble:

20 \

times

3

term estimate penalty

(

Intercept

) 8.112693

+ 01 0

AtBat

- 6.815959

- 01 0

Hits

2.772312

+ 00 0

HmRun

- 1.365680

+ 00 0

Runs

1.014826

+ 00 0

RBI

7.130224

- 01 0

Walks

3.378558

+ 00 0

Years

- 9.066800

+ 00 0

CAtBat

- 1.199478

- 03 0

CHits

1.361029

- 01 0

CHmRun

6.979958

- 01 0

CRuns

2.958896

- 01 0

CRBI

2.570711

- 01 0

CWalks

- 2.789666

- 01 0

LeagueN

5.321272

+ 01 0

DivisionW

- 1.228345

+ 02 0

PutOuts

2.638876

- 01 0

Assists

1.698796

- 01 0

Errors

- 3.685645

+ 00 0

NewLeagueN

- 1.810510

+ 01 0

Let us instead see what the estimates would be if the penalty was

11498 .

Store your output to tidy

2 .

What do you notice?

# Your code here

# tidy

2 < -

tidy

2 < -

tidy

(

linear

_

reg

(

penalty

= 11498,

mixture

= 0) % > %

set

_

mode

("

regression

") % > %

set

_

engine

("

glmnet

") % > %

fit

(

Salary ~

.,

data

=

Hitters

)

)

# Print the parameter estimates for penalty

= 11498

tidy

2

# your code here

A tibble:

20 \

times

3

term estimate penalty

(

Intercept

) 407.205936774 11498

AtBat

0.037003083 11498

Hits

0.138357552 11498

HmRun

0.525195508 11498

Runs

0.230978290 11498

RBI

0.240114775 11498

Walks

0.289971555 11498

Years

1.108832399 11498

CAtBat

0.003135215 11498

CHits

0.011666684 11498

CHmRun

0.087642789 11498

CRuns

0.023406258 11498

CRBI

0.024165723 11498

CWalks

0.025042117 11498

LeagueN

0.086629234 11498

DivisionW

- 6.225431332 11498

PutOuts

0.016506596 11498

Assists

0.002616335 11498

Errors

- 0.020564158 11498

NewLeagueN

0.302922899 11498

# Hidden Tests

Look below at the parameter estimates for penalty

= 705 .

Store your output to tidy

3 .

Once again, use the tidy function to accomplish this task.

# Your code here

# tidy

3 < -

tidy

3 < -

tidy

(

linear

_

reg

(

penalty

= 705,

mixture

= 0) % > %

set

_

mode

("

regression

") % > %

set

_

engine

("

glmnet

") % > %

fit

(

Salary ~

.,

data

=

Hitters

)

)

# Print the parameter estimates for penalty

= 705

tidy

3

# your code here

A tibble:

20 \

times

3

term estimat

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Linear Model Selection and Regularization This programming assignment will use the Tidy Models platform. It will take a look at regularization models and hyperparameter tuning. These models contain a...

Solve all parts with code The google colab code/file is : { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear Regression for Red Wine Quality Classification" ] }, {...

MATHEMATICS FOR MACHINE LEARNING Marc Peter Deisenroth A. Aldo Faisal Cheng Soon Ong Contents Foreword 1 Part I Mathematical Foundations 9 1 Introduction and Motivation 11 1.1 Finding Words for...

Could you please explain the findings of the study? A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models Evangelia...

For the jobs below please create for me three amazing cover letter based on my attached job experience and bio and make them great essays to help me stand out, thanks! I want a thorough and detailed...

Google colab assignment. Pls do .ipynb part in colab and word doc part in written word form. Thank you! Lab 3.2: Lab 4: In Lab 3.2 and Lab 4, we have learnt to build CNN model and to analyze of model...

Question:DEPARTMENT OF SUPPLY CHAIN & INFORMATION SYSTEMS MANAGEMENT INFORMATION SYSTEMS & EBUSINESS Question 1. Perform a critique on the following journal article: Schlagwein, Daniel and Hu, Monica...

OPERATIONS MANAGEMENT SIMULATION Give definitions or descriptions of Hard OR and Soft OR and discuss some examples of each. (Your answer should not be more than approximately two A4 pages.)...

Jupiter Notebook We have covered some of the limitations of single layer neural networks in class, but they are still powerful learning systems that provide a good way to begin learning about how to...

Atwater Manufacturing Co. leases its equipment from Westside Leasing Company. In each of the following cases, assuming none of the other criteria for capitalizing leases are met, determine whether...

If y1(x) = x is a solution of the following ODE use the reduction of order method to obtain an ODE for another solution y2(x) = x v(x) and v'(x) = u(x) Equation for u xu' + u(x) = v(x) = O Ty"-xy' +...

( 2 ) Use the correct formula above to find the derivative of the function f ( x ) = ( x 3 + 1 0 ) sqrt x .

P-1) (100 Pts.) A chemical manufacturing company (CMC) has a contract for the procurement of the neccssaly chemicals from four suppliers. The chemicals purchased from Supplier A are priced at $20...

(Appendices) SALES RETURNS. Swan and Bloom, Inc., is a wholesaler of novelty items to small stores. All sales are on credit with no discount offered. During March, Swan and Bloom accepted the...

(Appendices) ACCOUNTS RECEIVABLE TURNOVER. A recent annual report for Gerber Products Company shows credit sales for the year of approximately $1,269 million and average accounts receivable of...

(Appendices) FINANCIAL ANALYSIS OF RECEIVABLES. A chain of retail stores located in Kansas and Nebraska has requested a loan from First Chicago. The balance sheet of the retail chain shows...