Question:
Several statistics are commonly used to detect nonnormality in underlying population distributions.
Here we will study one that measures the amount of skewness in a distribution. Recall that any normally distributed random variable is symmetric about its mean; therefore, if we standardize a symmetrically distributed random variable, say $z = (y - \mu_y)/\sigma_y$, where $\mu_y = E(y)$ and $\sigma_y = \mathrm{sd}(y)$, then $z$ has mean zero, variance one, and $E(z^3) = 0$. Given a sample of data $\{y_i : i = 1, \ldots, n\}$, we can standardize $y_i$ in the sample by using $z_i = (y_i - \hat{\mu}_y)/\hat{\sigma}_y$, where $\hat{\mu}_y$ is the sample mean and $\hat{\sigma}_y$ is the sample standard deviation. (We ignore the fact that these are estimates based on the sample.) A sample statistic that measures skewness is $n^{-1}\sum_{i=1}^{n} z_i^3$, or where $n$ is replaced with $(n - 1)$ as a degrees-of-freedom adjustment. If $y$ has a normal distribution in the population, the skewness measure in the sample for the standardized values should not differ significantly from zero.
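The skewness measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own code: the function name skewness and the df_adjust flag are hypothetical, and it assumes the sample standard deviation uses the usual (n - 1) divisor.

```python
import numpy as np

def skewness(y, df_adjust=False):
    """Average of cubed standardized values, as described in the text.

    Assumption: the sample standard deviation uses the (n - 1) divisor.
    With df_adjust=True the sum of cubes is divided by (n - 1) instead of n.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    z = (y - y.mean()) / y.std(ddof=1)  # standardize with sample mean and sd
    divisor = n - 1 if df_adjust else n
    return np.sum(z ** 3) / divisor

# A perfectly symmetric sample has zero skewness.
symmetric = [1, 2, 3, 4, 5]
# One large value in the right tail gives positive skewness.
right_skewed = [1, 1, 1, 1, 10]

print(skewness(symmetric))
print(skewness(right_skewed))
print(skewness(right_skewed, df_adjust=True))
```

For parts (i) and (ii), the same function would be applied to a column and to its natural log, for example skewness(inc) and skewness(np.log(inc)), once the data set has been loaded and filtered.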
(i) First use the data set 401KSUBS, keeping only observations with fsize = 1. Find the skewness measure for inc. Do the same for log(inc). Which variable has more skewness and therefore seems less likely to be normally distributed?
(ii) Next use BWGHT2. Find the skewness measures for bwght and log(bwght). What do you conclude?
(iii) Evaluate the following statement: “The logarithmic transformation always makes a positive variable look more normally distributed.”
(iv) If we are interested in the normality assumption in the context of regression, should we be evaluating the unconditional distributions of y and log(y)? Explain.
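As a rough illustration of the distinction raised in part (iv), the sketch below compares the skewness of y itself with the skewness of OLS residuals. Everything in it is an assumption made for illustration: the data are simulated rather than taken from 401KSUBS or BWGHT2, the coefficients are arbitrary, and skewness is a hypothetical helper that uses the (n - 1) divisor for the sample standard deviation.

```python
import numpy as np

def skewness(v):
    # Mean of cubed standardized values (sample sd with the n - 1 divisor).
    v = np.asarray(v, dtype=float)
    z = (v - v.mean()) / v.std(ddof=1)
    return np.mean(z ** 3)

# Simulated data (illustrative only): a right-skewed regressor, normal errors.
rng = np.random.default_rng(0)
n = 500
x = rng.exponential(scale=1.0, size=n)
u = rng.normal(size=n)
y = 1.0 + 2.0 * x + u

# OLS of y on a constant and x, then residuals.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

print("skewness of y:        ", skewness(y))
print("skewness of residuals:", skewness(resid))
```

In this simulation y inherits skewness from x even though the errors are normal, which is why the unconditional distribution of y can look nonnormal while the residuals do not.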
