# Question

This table contains accounting and financial data that describe 324 companies operating in the information sector in 2010. The largest of these provide telephone services. One column gives the expenses on research and development (R&D), and another gives the total assets of the companies. Both columns are reported in millions of dollars. Use the logs of both variables rather than the originals. (That is, set Y to the natural log of R&D expenses, and set X to the natural log of assets. Note that the variables are recorded in millions of dollars, so a recorded value of 1,000 means $1 billion.)

(a) What problem with the use of the simple regression model (SRM) is evident in the scatterplot of Y on X as well as in the plot of the residuals from the fitted equation on X?

(b) If the residuals are nearly normal, of the values that lie outside the 95% prediction intervals, what proportion should be above the fitted equation?

(c) Based on the property of residuals identified in part (b), can you anticipate, without needing the normal quantile plot, that these residuals are not nearly normal?
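The check described in parts (b) and (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data table is not reproduced here, so the arrays `assets` and `rd` are synthetic stand-ins generated with made-up parameters, and the critical value 1.97 is an approximation to the 97.5% point of a t distribution with 322 degrees of freedom. The sketch fits the SRM to the logged variables, builds approximate 95% prediction intervals, and reports what share of the points outside those intervals lie above the fitted line.

```python
import numpy as np

def fit_srm(x, y):
    """Least-squares fit of y = b0 + b1*x; returns b0, b1, residuals, and s_e."""
    n = len(x)
    x_bar, y_bar = x.mean(), y.mean()
    sxx = np.sum((x - x_bar) ** 2)
    b1 = np.sum((x - x_bar) * (y - y_bar)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = y - (b0 + b1 * x)
    s_e = np.sqrt(np.sum(resid ** 2) / (n - 2))  # residual standard deviation
    return b0, b1, resid, s_e

def share_above_outside(x, resid, s_e, t_crit=1.97):
    """Among points outside the approximate 95% prediction interval,
    return the fraction that lie above the fitted line (NaN if none are outside)."""
    n = len(x)
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)
    half_width = t_crit * s_e * np.sqrt(1 + 1 / n + (x - x_bar) ** 2 / sxx)
    outside = np.abs(resid) > half_width
    if not outside.any():
        return float("nan")
    return float(np.mean(resid[outside] > 0))

# Synthetic stand-in data (ASSUMPTION: not the real table; parameters are invented).
rng = np.random.default_rng(0)
assets = rng.lognormal(mean=6.0, sigma=2.0, size=324)   # millions of dollars
rd = np.exp(-1.0 + 0.8 * np.log(assets) + rng.normal(0.0, 1.0, size=324))

x = np.log(assets)  # X = natural log of assets
y = np.log(rd)      # Y = natural log of R&D expenses

b0, b1, resid, s_e = fit_srm(x, y)
share = share_above_outside(x, resid, s_e)
print("fitted slope:", b1)
print("share of outside points above the line:", share)
```

With the real table, `assets` and `rd` would be replaced by the two columns of the data set. Comparing the reported share with the value expected under a symmetric error distribution gives a quick check before looking at the normal quantile plot.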
