Using R

We now fit a GAM to predict Salary in the Hitters dataset. First, we remove the observations for which the salary information is unknown, and then we split the data set into a training set and a test set using the following commands.

library(ISLR)

data("Hitters")

Hitters <- Hitters[!is.na(Hitters$Salary),]

set.seed(10)

train <- sample(nrow(Hitters), 200)

Hitters.train <- Hitters[train, ]

Hitters.test <- Hitters[-train, ]

(a)

Using log(Salary) (log-transformation of Salary) as response and the other variables as the predictors, perform forward stepwise selection on the training set in order to identify a satisfactory model that uses just a subset of the predictors.
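
One possible way to approach part (a) is sketched below. It assumes the leaps package is installed and uses its regsubsets() function with method = "forward"; choosing the model size by BIC is also an assumption, and Cp or adjusted R^2 would be equally reasonable criteria. The names fit.fwd, fwd.summary and best.size are illustrative.

```r
library(ISLR)
library(leaps)  # assumption: regsubsets() from leaps performs the search

# Recreate the training split from the question.
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]

# Forward stepwise search over models with 1 to 19 variables.
fit.fwd <- regsubsets(log(Salary) ~ ., data = Hitters.train,
                      nvmax = 19, method = "forward")
fwd.summary <- summary(fit.fwd)

# Assumption: BIC picks the model size; Cp or adjusted R^2 are alternatives.
best.size <- which.min(fwd.summary$bic)

# Coefficients of the chosen model (intercept plus best.size predictors).
coef(fit.fwd, best.size)
```

The names returned by coef() identify the selected predictors, which can then be carried into the GAM of part (b).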

(b)

Fit a GAM on the training data, using log(Salary) as the response and the features selected in the previous step as the predictors. Plot the results, and explain your findings.
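
A minimal sketch of part (b), assuming the gam package and smoothing splines with 4 degrees of freedom each. The predictors Hits, Years and CRuns are placeholders chosen only for illustration; they are not the result of the selection in part (a) and should be replaced by whatever that step actually selects.

```r
library(ISLR)
library(gam)  # assumption: the gam package (s(), lo()) is used, not mgcv

# Recreate the training split from the question.
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]

# Placeholder predictors (hypothetical): swap in the variables from part (a).
gam.fit <- gam(log(Salary) ~ s(Hits, df = 4) + s(Years, df = 4) +
                 s(CRuns, df = 4),
               data = Hitters.train)

# One panel per term, with pointwise standard-error bands.
par(mfrow = c(1, 3))
plot(gam.fit, se = TRUE, col = "blue")
```

Each panel shows the fitted contribution of one predictor to log(Salary); a roughly straight curve suggests a linear effect, while visible curvature suggests a non-linear one.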

(c)

Evaluate the model obtained on the test set. Try different tuning parameters (if you are using smoothing splines s(), try different values of df; if you are using local regression lo(), try different values of span) and explain the results obtained.
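
The comparison in part (c) could be organised along the following lines. This is a sketch under assumptions: the gam package with s() terms, the same df for every term, placeholder predictors (Hits, Years, CRuns) standing in for the variables chosen in part (a), and test mean squared error on the log scale as the evaluation metric. The helper name test.mse.for.df is hypothetical.

```r
library(ISLR)
library(gam)  # assumption: the gam package is used

# Recreate the training/test split from the question.
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]
Hitters.test <- Hitters[-train, ]

# Hypothetical helper: fit with a given df and return the test MSE
# of log(Salary). Predictors are placeholders, not the part (a) result.
test.mse.for.df <- function(d) {
  form <- as.formula(paste0("log(Salary) ~ s(Hits, df = ", d,
                            ") + s(Years, df = ", d,
                            ") + s(CRuns, df = ", d, ")"))
  fit <- gam(form, data = Hitters.train)
  pred <- predict(fit, newdata = Hitters.test)
  mean((log(Hitters.test$Salary) - pred)^2)
}

dfs <- c(2, 4, 6, 8)
test.mse <- sapply(dfs, test.mse.for.df)
data.frame(df = dfs, test.mse = test.mse)
```

A small df gives smoother, more rigid fits and a large df gives more flexible ones, so the df with the lowest test MSE indicates where the bias-variance trade-off is best for this split. The same loop works for lo() by varying span instead of df.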

(d)

For which variables, if any, is there evidence of a non-linear relationship with the response?
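
For part (d), one way to look for evidence of non-linearity is sketched here, again assuming the gam package and the placeholder predictors Hits, Years and CRuns (hypothetical; substitute the variables from part (a)). summary() on a fitted model from this package reports an "Anova for Nonparametric Effects" table with one test per smooth term, and anova() with test = "F" compares an all-linear fit against the smooth fit.

```r
library(ISLR)
library(gam)  # assumption: the gam package is used

# Recreate the training split from the question.
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]

# Placeholder predictors entered linearly, then as smoothing splines.
gam.linear <- gam(log(Salary) ~ Hits + Years + CRuns,
                  data = Hitters.train)
gam.smooth <- gam(log(Salary) ~ s(Hits, df = 4) + s(Years, df = 4) +
                    s(CRuns, df = 4),
                  data = Hitters.train)

# Per-term tests: see the nonparametric effects table in the output.
summary(gam.smooth)

# Joint test of the linear fit against the smooth fit.
cmp <- anova(gam.linear, gam.smooth, test = "F")
cmp
```

A small p-value for a term in the nonparametric effects table suggests that a linear function of that variable is not adequate; terms with large p-values can reasonably be kept linear.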
