Referring to the box plots introduced in Chapter 2, the sides of the “box” are at the first and third quartiles, and the difference between these (the length of the box) is called the interquartile range (IQR). A mild outlier is an observation that lies between 1.5 and 3 IQRs beyond the nearest side of the box, and an extreme outlier is an observation that lies more than 3 IQRs beyond the nearest side of the box.
a. If the data are normally distributed, what percentage of values will be mild outliers? What percentage will be extreme outliers? Why don’t the answers depend on the mean and/or standard deviation of the distribution?
b. Check your answers to part a with simulation. Simulate a large number of normal random numbers, and count the number of mild and extreme outliers with appropriate IF functions. Do the resulting percentages match, at least approximately, your answers to part a?
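For readers who want to cross-check a spreadsheet solution, the following is a minimal Python sketch of the same idea, not the intended spreadsheet method. It assumes the outlier cutoffs are measured from the theoretical quartiles of the distribution rather than from sample quartiles. The function names, the mean of 100, the standard deviation of 15, the sample size, and the seed are all arbitrary choices made for illustration.

```python
import random
from statistics import NormalDist


def theoretical_outlier_rates():
    """Return (mild, extreme) outlier probabilities for a normal distribution."""
    z = NormalDist()             # standard normal; see note below on why this suffices
    q3 = z.inv_cdf(0.75)         # third quartile; first quartile is -q3 by symmetry
    iqr = 2 * q3
    inner = q3 + 1.5 * iqr       # beyond this: at least a mild outlier
    outer = q3 + 3.0 * iqr       # beyond this: an extreme outlier
    beyond_inner = 2 * (1 - z.cdf(inner))   # both tails
    extreme = 2 * (1 - z.cdf(outer))        # both tails
    return beyond_inner - extreme, extreme


def simulated_outlier_rates(n=200_000, mean=100.0, sd=15.0, seed=1):
    """Estimate (mild, extreme) outlier rates from n simulated normal values."""
    rng = random.Random(seed)
    dist = NormalDist(mean, sd)
    q1, q3 = dist.inv_cdf(0.25), dist.inv_cdf(0.75)
    iqr = q3 - q1
    mild = extreme = 0
    for _ in range(n):
        x = rng.gauss(mean, sd)
        # Distance beyond the nearest side of the box, in IQR units (0 if inside).
        distance = max(q1 - x, x - q3, 0.0) / iqr
        if distance > 3:         # plays the role of one spreadsheet IF test
            extreme += 1
        elif distance > 1.5:     # plays the role of the other IF test
            mild += 1
    return mild / n, extreme / n
```

Calling `simulated_outlier_rates` with different values of `mean` and `sd` should give rates close to those from `theoretical_outlier_rates`, because the quartiles and the IQR shift and scale together with the data. Extreme outliers are rare enough that a very large sample is needed before any appear at all.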

