Recall that Table presents data concerning the need for labor in 16 U.S. Navy hospitals. This table gives values of the dependent variable Hours (monthly labor hours) and of the independent variables Xray (monthly X-ray exposures), BedDays (monthly occupied bed days; a hospital has one occupied bed day if one bed is occupied for an entire day), and Length (average length of patients' stay, in days). The data in Table are part of a larger data set analyzed by the Navy. The complete data set includes two additional independent variables, Load (average daily patient load) and Pop (eligible population in the area, in thousands), values of which are given in the page margin. Figure 14.30 gives Excel and MINITAB outputs of multicollinearity analysis and model building for the complete hospital labor needs data set.
a. (1) Find the three largest simple correlation coefficients among the independent variables and the three largest variance inflation factors in Figure 14.30(a) and (b). (2) Discuss why these statistics imply that the independent variables BedDays, Load, and Pop are most strongly involved in multicollinearity and thus contribute possibly redundant information for predicting Hours. Note that, although we have reasoned in Exercise 14.6(a) on page 535 that a negative coefficient (that is, least squares point estimate) for Length might be intuitively reasonable, the negative coefficients for Load and Pop [Figure 14.30(b)] are not intuitively reasonable and are a further indication of strong multicollinearity. We conclude that a final regression model for predicting Hours may not need all three of the potentially redundant independent variables BedDays, Load, and Pop.
b. Figure 14.30(c) indicates that the two best hospital labor needs models are the model using Xray, BedDays, Pop, and Length, which we will call Model 1, and the model using Xray, BedDays, and Length, which we will call Model 2. (1) Which model gives the smallest value of s and the largest value of adjusted R²? (2) Which model gives the smallest value of C? (3) Consider a questionable hospital for which Xray = 56,194, BedDays = 14,077.88, Pop = 329.7, and Length = 6.89. The 95 percent prediction intervals given by Models 1 and 2 for labor hours corresponding to this combination of values of the independent variables are, respectively, [14,888.43, 16,861.30] and [14,906.24, 16,886.26]. Which model gives the shortest prediction interval?
c. (1) Which model is chosen by stepwise regression in Figure 14.30(d)? (2) If we start with all five potential independent variables and use backward elimination with an α_stay of .05, the procedure removes (in order) Load and Pop and then stops. Which model is chosen by backward elimination? (3) Overall, which model seems best? (4) Which of BedDays, Load, and Pop does this best model use?
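
The correlation matrix and variance inflation factors that part (a) reads from Figure 14.30 can be reproduced with a few lines of code. The following is a minimal sketch in Python with NumPy, not the Excel or MINITAB procedure behind the figure. The hospital data are not reproduced here, so the arrays below are synthetic stand-ins (an assumption) built so that BedDays, Load, and Pop move together; only the column names mirror the exercise. The function name variance_inflation_factors is hypothetical. It uses the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing the j-th independent variable on all the others.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 is the R^2 from regressing column j on the remaining
    columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        sse = resid @ resid
        sst = ((y - y.mean()) ** 2).sum()
        r_squared = 1.0 - sse / sst
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)


# Synthetic stand-in data (NOT the Navy hospital values): Load and Pop
# are deliberately constructed as noisy functions of BedDays.
rng = np.random.default_rng(0)
n = 16
bed_days = rng.normal(5000.0, 2000.0, n)
load = bed_days / 30.0 + rng.normal(0.0, 5.0, n)
pop = 0.05 * bed_days + rng.normal(0.0, 20.0, n)
xray = rng.normal(20000.0, 8000.0, n)
length = rng.normal(5.0, 1.0, n)

names = ["Xray", "BedDays", "Load", "Pop", "Length"]
X = np.column_stack([xray, bed_days, load, pop, length])

# Simple correlation coefficients among the independent variables.
corr = np.corrcoef(X, rowvar=False)
vifs = variance_inflation_factors(X)

print(np.round(corr, 3))
for name, vif in zip(names, vifs):
    print(f"{name:8s} VIF = {vif:.2f}")
```

With the real data loaded into X in place of the synthetic columns, the largest off-diagonal entries of corr and the largest entries of vifs would be the statistics part (a) asks for. A common rule of thumb treats a VIF above 10 as a sign of severe multicollinearity.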

  • Created May 28, 2015