Collinearity is sometimes described as a problem with the data, not the model. Rather than filling the scatterplot of X1 on X2, collinear data concentrate along a diagonal. For example, the following plot shows monthly percentage changes in the whole stock market and the S&P 500 (in excess of the risk-free rate of return). The data are monthly and span the same period considered in the text, 1995 through 2011.

(a) Data for two months (February and March of 2000, identified as × in the plot) deviate from the pattern evident in other months. What makes these months unusual?

(b) If you were to use both returns on the market and those on the S&P 500 as explanatory variables in the same regression, would these two months be leveraged?

(c) Would you use these two months in the multiple regression or exclude them?
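
The idea behind part (b) can be explored numerically. The following is a minimal sketch on synthetic data, not the actual market and S&P 500 returns: the variable names, sample size, and noise levels are all assumptions chosen for illustration. It builds two nearly identical explanatory variables, moves one observation off the diagonal without making either coordinate extreme, and compares that observation's leverage when one explanatory variable is used with its leverage when both are used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two return series (assumed values, not real data).
n = 200
x1 = rng.normal(0.0, 4.0, size=n)        # hypothetical "market" excess returns (%)
x2 = x1 + rng.normal(0.0, 0.3, size=n)   # hypothetical "S&P 500" returns, nearly collinear

# Move one observation off the diagonal. Neither coordinate is extreme on its own.
x1[0], x2[0] = 2.0, -2.0


def leverages(X):
    """Return the diagonal of the hat matrix H = X (X'X)^{-1} X'."""
    Q, _ = np.linalg.qr(X)          # reduced QR: columns of Q span the columns of X
    return np.sum(Q**2, axis=1)     # h_ii is the squared row norm of Q


ones = np.ones(n)
lev_simple = leverages(np.column_stack([ones, x1]))        # one explanatory variable
lev_multiple = leverages(np.column_stack([ones, x1, x2]))  # both explanatory variables

corr = np.corrcoef(x1, x2)[0, 1]
print(f"correlation between x1 and x2: {corr:.3f}")
print(f"leverage of the off-diagonal point, x1 only: {lev_simple[0]:.4f}")
print(f"leverage of the off-diagonal point, x1 and x2: {lev_multiple[0]:.4f}")
print(f"average leverage with both variables: {lev_multiple.mean():.4f}")
```

In this sketch the off-diagonal point looks ordinary in a regression on x1 alone but has the largest leverage once both variables enter the model. Leverage measures distance from the bulk of the explanatory data, and with collinear variables that distance is judged relative to the narrow spread perpendicular to the diagonal.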


