# Question: The general manager of a major league baseball team would

The general manager of a major league baseball team would like to develop a regression model to predict the number of wins during the season by a starting pitcher. The Excel file MLB pitchers.xlsx provides the following data on a random sample of starting pitchers from a recent season:

• Wins

• Average walks and hits per inning pitched (WHIP)

• Average strikeouts per nine innings (K/9)

• Average strikeout to walk ratio (K/BB)

• Earned run average (ERA): the average number of earned runs given up per game

• Average pitches per plate appearance (P/PA)

• Average pitches per inning (P/IP)

• The ground ball to fly ball ratio (G/F): pitchers who have higher G/F ratios tend to cause batters to hit the ball on the ground rather than in the air

• Run support average (RS): the average number of runs scored by the pitcher’s team per start

• Right-handed or left-handed pitcher (R/L)

a. Check for the presence of multicollinearity between the independent variables. If it is present, take the necessary steps to eliminate it.

b. Construct a regression model using a best subsets regression that predicts the average number of wins for a pitcher using the independent variables from part a.

c. Interpret the meaning of the regression coefficients from part b.

d. Construct a 99% confidence interval for the regression coefficient of the run support variable from part b. Be sure to interpret the meaning of this confidence interval.

e. Predict the average number of wins for a left-handed pitcher who averages 1.2 walks and hits per inning, 7.1 strikeouts per game, 3.8 pitches per plate appearance, 15.2 pitches per inning, a ground ball to fly ball ratio of 0.8, a strikeout to walk ratio of 2.5, and an earned run average of 3.6 runs per game, and whose team averages 5.3 runs per game during his starts.

f. Perform a residual analysis to verify that the conditions for the regression model are met for the model developed in part b.

g. The general manager would like to add a new starting pitcher to his team’s roster. Using the results of this model, should he pursue a pitcher who has a high strikeout to walk ratio or one who has a high ground ball to fly ball ratio? Explain your choice.
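Parts a and b can be approached along the following lines. This is only a minimal sketch: the data are synthetic stand-ins generated on the spot, not the contents of MLB pitchers.xlsx, and the function names `r_squared`, `vif`, and `best_subset` are hypothetical helpers written for this illustration. It computes a variance inflation factor (VIF) for each predictor, then runs an exhaustive best subsets search ranked by adjusted R². A common rule of thumb flags VIF values above 5 (some texts use 10) as a sign of multicollinearity. With the real file, the R/L column would first need to be coded as a 0/1 dummy variable.

```python
import itertools
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all the other predictor columns."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        factors.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(factors)

def best_subset(X, y, names):
    """Try every non-empty subset of predictors; keep the one with the
    highest adjusted R^2."""
    n, p = X.shape
    best_adj, best_cols = -np.inf, ()
    for k in range(1, p + 1):
        for cols in itertools.combinations(range(p), k):
            r2 = r_squared(X[:, list(cols)], y)
            adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
            if adj > best_adj:
                best_adj, best_cols = adj, cols
    return best_adj, [names[c] for c in best_cols]

# Synthetic illustration only (assumed values, NOT the textbook data):
# ERA is built to track WHIP closely, so both should show a high VIF.
rng = np.random.default_rng(0)
n = 60
whip = rng.normal(1.3, 0.15, n)
era = 3.0 * whip + rng.normal(0.0, 0.1, n)
rs = rng.normal(4.5, 0.8, n)
wins = 2.0 + 2.0 * rs - 1.5 * era + rng.normal(0.0, 1.0, n)

names = ["WHIP", "ERA", "RS"]
X = np.column_stack([whip, era, rs])

for name, factor in zip(names, vif(X)):
    print(f"VIF({name}) = {factor:.2f}")

adj_r2, chosen = best_subset(X, wins, names)
print("Best subset by adjusted R^2:", chosen, f"(adj R^2 = {adj_r2:.3f})")
```

When two predictors both show a high VIF, the usual remedy is to drop one of them and recompute the VIFs before running the subset search. The exhaustive search is practical here because nine predictors give only 511 candidate subsets.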
