# Question: Recall that Table presents data concerning the need for labor

Recall that Table presents data concerning the need for labor in 16 U.S. Navy hospitals. This table gives values of the dependent variable Hours (monthly labor hours) and of the independent variables Xray (monthly X-ray exposures), BedDays (monthly occupied bed days; a hospital has one occupied bed day if one bed is occupied for an entire day), and Length (average length of patients' stay, in days). The data in Table are part of a larger data set analyzed by the Navy. The complete data set includes two additional independent variables, Load (average daily patient load) and Pop (eligible population in the area, in thousands), values of which are given in the page margin. Figure 14.30 gives Excel and MINITAB outputs of multicollinearity analysis and model building for the complete hospital labor needs data set.

a. (1) Find the three largest simple correlation coefficients among the independent variables and the three largest variance inflation factors in Figure 14.30(a) and (b). (2) Discuss why these statistics imply that the independent variables BedDays, Load, and Pop are most strongly involved in multicollinearity and thus contribute possibly redundant information for predicting Hours. Note that, although we have reasoned in Exercise 14.6(a) on page 535 that a negative coefficient (that is, least squares point estimate) for Length might be intuitively reasonable, the negative coefficients for Load and Pop [Figure 14.30(b)] are not intuitively reasonable and are a further indication of strong multicollinearity. We conclude that a final regression model for predicting Hours may not need all three of the potentially redundant independent variables BedDays, Load, and Pop.

b. Figure 14.30(c) indicates that the two best hospital labor needs models are the model using Xray, BedDays, Pop, and Length, which we will call Model 1, and the model using Xray, BedDays, and Length, which we will call Model 2. (1) Which model gives the smallest value of s and the largest value of adjusted R²? (2) Which model gives the smallest value of C? (3) Consider a questionable hospital for which Xray = 56,194, BedDays = 14,077.88, Pop = 329.7, and Length = 6.89. The 95 percent prediction intervals given by Models 1 and 2 for labor hours corresponding to this combination of values of the independent variables are, respectively, [14,888.43, 16,861.30] and [14,906.24, 16,886.26]. Which model gives the shortest prediction interval?

c. (1) Which model is chosen by stepwise regression in Figure 14.30(d)? (2) If we start with all five potential independent variables and use backward elimination with an α_stay of .05, the procedure removes (in order) Load and Pop and then stops. Which model is chosen by backward elimination? (3) Overall, which model seems best? (4) Which of BedDays, Load, and Pop does this best model use?
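For part (a), it may help to see how the two multicollinearity diagnostics are computed. The sketch below is illustrative only: the hospital data and Figure 14.30 are not reproduced here, so it uses synthetic columns `a`, `b`, and `c` (hypothetical names, not the exercise's variables), where `b` is built to be nearly a copy of `a`. It uses the standard definition VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the other predictors.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]                                  # predictor j as the response
        others = np.delete(X, j, axis=1)             # remaining predictors
        A = np.column_stack([np.ones(n), others])    # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        sse = resid @ resid
        sst = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - sse / sst
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic data (an assumption for illustration, NOT the Navy hospital data).
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = a + 0.05 * rng.normal(size=n)   # nearly redundant with a
c = rng.normal(size=n)              # unrelated to a and b
X = np.column_stack([a, b, c])

corr = np.corrcoef(X, rowvar=False)  # simple correlations among predictors
vifs = variance_inflation_factors(X)

print(np.round(corr, 3))
print(np.round(vifs, 1))
```

In this synthetic setup the pair `a`, `b` shows both a simple correlation near 1 and very large VIFs, while `c` has a VIF near 1. The same pattern in Figure 14.30(a) and (b) is what would single out BedDays, Load, and Pop as carrying largely redundant information.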
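For part (b)(3), the comparison comes down to subtracting the endpoints of each 95 percent prediction interval. A minimal sketch of that arithmetic, using only the endpoints quoted in the exercise (the dictionary and variable names are illustrative):

```python
# Prediction interval endpoints as quoted in part (b)(3).
intervals = {
    "Model 1": (14888.43, 16861.30),
    "Model 2": (14906.24, 16886.26),
}

# Length of each interval = upper endpoint - lower endpoint.
lengths = {name: round(hi - lo, 2) for name, (lo, hi) in intervals.items()}
shortest = min(lengths, key=lengths.get)

print(lengths)
print(shortest)
```

The lengths work out to 1,972.87 hours for Model 1 and 1,980.02 hours for Model 2, so the two intervals differ by only about 7 labor hours.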
