# Question

Explain what’s wrong with the way regression is used in each of the following examples:

a. Winning times in the Boston marathon (at www.bostonmarathon.org) have followed a straight-line decreasing trend from 160 minutes in 1927 (when the race was first run at the Olympic distance of about 26 miles) to 130 minutes in 2004. After fitting a regression line to the winning times, you use the equation to predict that the winning time in the year 2300 will be about 13 minutes.

b. Using data for several cities on x = % of residents with a college education and y = median price of home, you get a strong positive correlation. You conclude that having a college education causes you to be more likely to buy an expensive house.

c. A regression between x = number of years of education and y = annual income for 100 people shows a modest positive trend, except for one person who dropped out after 10th grade but is now a multimillionaire. It’s wrong to ignore any of the data, so we should report all results including this point. For this data, the correlation r = -0.28.
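The arithmetic behind the prediction in part a can be sketched as follows. This is only a rough illustration: it assumes the trend is the straight line through the two winning times quoted in the problem, whereas the actual fitted regression line would use every year's result and so would differ slightly. The names `predicted_time` and `zero_crossing_year` are hypothetical helpers, not part of the original problem.

```python
# Straight line through the two winning times quoted in part a.
# Assumption: these two endpoints stand in for the fitted regression line.
year_start, time_start = 1927, 160.0  # winning time in minutes
year_end, time_end = 2004, 130.0      # winning time in minutes

# Change in winning time per year (negative: times are decreasing).
slope = (time_end - time_start) / (year_end - year_start)


def predicted_time(year):
    """Winning time in minutes that the straight line gives for `year`."""
    return time_end + slope * (year - year_end)


def zero_crossing_year():
    """Year in which the straight line reaches a winning time of 0 minutes."""
    return year_end - time_end / slope


if __name__ == "__main__":
    print(f"slope: {slope:.3f} minutes per year")
    print(f"predicted 2300 time: {predicted_time(2300):.1f} minutes")
    print(f"line reaches zero in: {zero_crossing_year():.0f}")
```

Under this two-point assumption the line gives a 2300 winning time in the same range as the figure quoted in the problem, and it reaches zero minutes a few decades later, which shows how far outside the observed years (1927 to 2004) the prediction sits.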
