A study was conducted to determine whether infection surveillance and control programs have reduced the rates of hospital-acquired infection in U.S. hospitals. This data set consists of a random sample of 28 hospitals selected from 338 hospitals participating in a larger study. Each line of the data set provides information on variables for a single hospital. The variables are as follows:
RISK = output variable, average estimated probability of acquiring infection in hospital (in percent)
STAY = input variable, average length of stay of all patients in hospital (in days)
AGE = input variable, average age of patients (in years)
INS = input variable, ratio of number of cultures performed to number of patients without signs or symptoms of hospital-acquired infection (times 100)
SCHOOL = dummy input variable for medical school affiliation, 1 = yes, 0 = no
RC1 = dummy input variable for region of country, 1 = northeast, 0 = other
RC2 = dummy input variable for region of country, 1 = north central, 0 = other
RC3 = dummy input variable for region of country, 1 = south, 0 = other
The data were analyzed using SAS with the following results.
Does the set of seven input variables contain information about the output variable, RISK? Give a p-value for your test.
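One way to set up this question is as the overall F test for the regression. The sketch below assumes the usual linear model with an intercept and the seven inputs; the symbols β_1, ..., β_7, SSR, SSE, MSR, and MSE are notation introduced here, and their numerical values must be read from the analysis-of-variance table in the SAS output. With n = 28 hospitals and 7 inputs, the error degrees of freedom are 28 - 8 = 20.

```latex
\begin{align*}
H_0 &: \beta_1 = \beta_2 = \cdots = \beta_7 = 0
  \quad\text{versus}\quad
  H_a : \text{at least one } \beta_j \neq 0, \\[4pt]
F &= \frac{\mathrm{SSR}/7}{\mathrm{SSE}/(28 - 8)}
   = \frac{\mathrm{MSR}}{\mathrm{MSE}}, \\[4pt]
p\text{-value} &= P\bigl(F_{7,\,20} \ge F_{\text{observed}}\bigr).
\end{align*}
```

A small p-value indicates that at least one of the seven inputs carries information about RISK.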
Based on the full regression model (seven input variables), can we be at least 95% certain that hospitals in the south have at least 0.5% higher risk of infection than hospitals in the west, all other things being equal?
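This question can be framed as a one-sided t test on the coefficient of RC3, since the west is the region with RC1 = RC2 = RC3 = 0 and so serves as the baseline. The sketch below assumes that "0.5% higher" means 0.5 units of RISK, which is measured in percent. The symbols β_RC3, b_RC3, and SE(b_RC3) are notation introduced here; the estimate and its standard error must be taken from the parameter estimates in the SAS output.

```latex
\begin{align*}
H_0 &: \beta_{\mathrm{RC3}} \le 0.5
  \quad\text{versus}\quad
  H_a : \beta_{\mathrm{RC3}} > 0.5, \\[4pt]
t &= \frac{b_{\mathrm{RC3}} - 0.5}{\mathrm{SE}(b_{\mathrm{RC3}})},
  \qquad \text{df} = 28 - 8 = 20, \\[4pt]
\text{reject } H_0 &\text{ at the 5\% level if } t > t_{0.05,\,20} \approx 1.725 .
\end{align*}
```

Equivalently, the answer is yes if the one-sided 95% lower confidence bound, b_RC3 - 1.725 SE(b_RC3), exceeds 0.5.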
