Question:
LINK TO DATA
https://drive.google.com/open?id=0Bwos1D7Lt802TEVVZzJLSV9lRmtWZ1VmV1lickdsSDVTdFk0
In this assignment, you will again use the baseball salary data found in the Data Sets link.
(1) Use the command leaps in the R package leaps along with the strategy discussed in class to choose a good subset of the 16 independent variables to include in a linear model. Describe fully the rationale you use in choosing your model.
(2) Plot standardized residuals from your model versus an index running from 1 to 337. Identify any players who have standardized residuals that are larger in absolute value than 3. Are these players different in any important way from most of the other players?
(3) Provide a plot of standardized residuals versus predicted values and comment on the plot.
(4) Provide a normal probability plot of standardized residuals and comment on the plot.
(5) Provide a plot of Cook's D values. Do any data points seem to be influential? Why or why not?
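
The five parts above map onto a handful of R calls, which might be sketched as follows. This is only an outline under stated assumptions: the data are simulated stand-ins (337 rows, 16 predictors with the hypothetical names x1 to x16 and response y), because the real column names are not given here, and Mallows' Cp is used as the selection criterion only as one common choice, since the strategy discussed in class is not stated. The leaps package must be installed for part (1); the sketch falls back to the full model otherwise.

```r
# Simulated stand-in for the salary data: 337 rows, 16 predictors.
# The names x1..x16 and y are hypothetical, not the real column names.
set.seed(1)
n <- 337
p <- 16
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- drop(1 + X[, 1:3] %*% c(2, -1, 0.5) + rnorm(n))

# (1) Best-subset search. Mallows' Cp is an assumed criterion here;
#     substitute whatever rule was discussed in class.
if (requireNamespace("leaps", quietly = TRUE)) {
  sel  <- leaps::leaps(x = X, y = y, method = "Cp", nbest = 1)
  keep <- sel$which[which.min(sel$Cp), ]  # logical vector over the 16 columns
} else {
  keep <- rep(TRUE, p)                    # fallback: keep all predictors
}
dat <- data.frame(y = y, X[, keep, drop = FALSE])
fit <- lm(y ~ ., data = dat)

# (2) Standardized residuals against case index; flag |r| > 3.
r <- rstandard(fit)
plot(seq_len(n), r, xlab = "Index", ylab = "Standardized residual")
abline(h = c(-3, 3), lty = 2)
flagged <- which(abs(r) > 3)              # row numbers to inspect by hand

# (3) Standardized residuals against predicted values.
plot(fitted(fit), r, xlab = "Predicted value", ylab = "Standardized residual")
abline(h = 0, lty = 2)

# (4) Normal probability plot of the standardized residuals.
qqnorm(r)
qqline(r)

# (5) Cook's D for every observation.
d <- cooks.distance(fit)
plot(d, type = "h", xlab = "Index", ylab = "Cook's D")
```

With the real data, `flagged` would give the row numbers of the players to look up for part (2). The code produces the plots only; the written comments the question asks for still depend on what those plots show.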
