Question: Consider the regression model $y = \beta_0 + \sum_{j=1}^{4} \beta_j x_j + \varepsilon$ with the data given by

t(dat)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## X1 0.0 -1 0.0 -1 -1 0 1.0 -1.0
## X2 -1.0 0 1.0 1 0 -1 0.0 0.0 OOH
## X3 -1.0 1 1.0 -1 1 1 1.0 1.0 NOON
## X4 -1.0 -1 1.0 -1 0 0 -1.0 0.0
## y -0.7 -1 -0.1 1 -1 -4 -0.8 -0.3 -3

Under the standard assumption $\varepsilon \sim N(0, \sigma^2 I_n)$, answer the following:

(a) Which of the 10 data points has the highest influence on the regression, based on Cook's distance?
(b) Which of the 10 data points has the highest leverage without having much influence on the regression?
(c) Using the PRESS statistic, which of the following two models should be selected: (1) the model that includes $x_1$ and $x_2$, or (2) the model that includes $x_3$ and $x_4$? Both models also include the intercept.
(d) Is it reasonable to use $R^2$ to select between the two models in part (c)? Justify your answer.
(e) Argue that the comparison based on $R^2$ in part (d) is equivalent to comparing the p-values of two F-tests. Specify those tests.
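For reference, the diagnostics named in parts (a) through (e) have standard closed forms in terms of the hat matrix. The sketch below uses notation that is not in the question and is assumed here: $X$ is the $n \times p$ design matrix including the intercept column, $e_i$ is the $i$-th ordinary residual, $h_{ii}$ is its leverage, and $k$ is the number of predictors excluding the intercept.

```latex
\begin{align*}
H &= X (X^\top X)^{-1} X^\top, \qquad h_{ii} = H_{ii}
  && \text{(leverage of point } i\text{)} \\
s^2 &= \frac{1}{n - p} \sum_{i=1}^{n} e_i^2
  && \text{(residual variance estimate)} \\
D_i &= \frac{e_i^2}{p\, s^2} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}
  && \text{(Cook's distance)} \\
\mathrm{PRESS} &= \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2
  && \text{(leave-one-out prediction error)} \\
F &= \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
  && \text{(overall F statistic for } H_0\colon \text{all slopes are } 0\text{)}
\end{align*}
```

Both candidate models in part (c) have $k = 2$ and $n = 10$, so each overall F statistic is referred to the same $F_{2,7}$ distribution. Because $F$ is an increasing function of $R^2$, the model with the larger $R^2$ is also the one with the smaller p-value.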
