A baseball analytics specialist wants to determine which variables are important in predicting a team’s wins in a given season. He has collected data related to wins, earned run average (ERA), and runs scored per game in a recent season (stored in Baseball). Develop a model to predict the number of wins based on ERA and runs scored per game.
a. State the multiple regression equation.
b. Interpret the meaning of the slopes in this equation.
c. Predict the mean number of wins for a team that has an ERA of 4.00 and has scored 4.0 runs per game.
d. Perform a residual analysis on the results and determine whether the regression assumptions are valid.
e. Is there a significant relationship between the number of wins and the two independent variables (ERA and runs scored per game) at the 0.05 level of significance?
f. Determine the p value in (e) and interpret its meaning.
g. Interpret the meaning of the coefficient of multiple determination in this problem.
h. Determine the adjusted r2.
i. At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the regression model. Indicate the most appropriate regression model for this set of data.
j. Determine the p values in (i) and interpret their meaning.
k. Construct a 95% confidence interval estimate of the population slope between wins and ERA.