# Question

The Bell Curve (New York: Free Press, 1994), by Richard Herrnstein and Charles Murray (H&M), is a controversial book about race, genes, IQ, and economic mobility. The book relies heavily on statistics and statistical methodology in an attempt to support the authors’ positions on the relationships among these variables and their social consequences. The main theme of The Bell Curve can be summarized as follows:

1. Measured intelligence (IQ) is largely genetically inherited.

2. IQ is correlated positively with a variety of socioeconomic status success measures, such as a prestigious job, a high annual income, and high educational attainment.

3. From 1 and 2, it follows that socioeconomic successes are largely genetically caused and therefore resistant to educational and environmental interventions (such as affirmative action).

The statistical methodology (regression) employed by the authors and the inferences derived from the statistics were critiqued in Chance (Summer 1995) and The Journal of the American Statistical Association (Dec. 1995). The following are just a few of the problems with H&M’s use of regression that have been identified:

Problem 1 H&M consistently use a trio of independent variables—IQ, socioeconomic status, and age—in a series of first-order models designed to predict dependent social outcome variables such as income and unemployment. (Only on a single occasion are interaction terms incorporated.) Consider, for example, the model

E(y) = β0 + β1x1 + β2x2 + β3x3

where y = income, x1 = IQ, x2 = socioeconomic status, and x3 = age. H&M use t-tests on the individual β parameters to assess the importance of the independent variables. As with most of the models considered in The Bell Curve, the estimate of β1 in the income model is positive and statistically significant at α = .05, and the associated t-value is larger (in absolute value) than the t-values associated with the other independent variables. Consequently, H&M claim that IQ is a better predictor of income than the other two independent variables. No attempt was made to determine whether the model was properly specified or whether the model provides an adequate fit to the data.
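One way to check whether the first-order model is adequately specified is a nested-model F-test that compares it with a model containing the pairwise interaction terms. The sketch below is only an illustration: the data are synthetic (not H&M's), the coefficient values and the helper names `fit_sse` and `nested_f` are assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np

def fit_sse(X, y):
    """Least-squares fit; returns (coefficients, sum of squared errors)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def nested_f(sse_reduced, sse_complete, extra_terms, n, k_complete):
    """F statistic for H0: all extra terms in the complete model are zero."""
    numerator = (sse_reduced - sse_complete) / extra_terms
    denominator = sse_complete / (n - (k_complete + 1))
    return numerator / denominator

# Synthetic data (an assumption for illustration, NOT H&M's data).
rng = np.random.default_rng(0)
n = 500
iq = rng.normal(100, 15, n)
ses = rng.normal(0, 1, n)
age = rng.uniform(25, 35, n)
# The simulated "truth" deliberately contains an IQ x SES interaction.
income = (20 + 0.3 * iq + 2 * ses + 0.5 * age
          + 0.2 * (iq - 100) * ses + rng.normal(0, 5, n))

ones = np.ones(n)
X_first = np.column_stack([ones, iq, ses, age])            # first-order model
X_inter = np.column_stack([ones, iq, ses, age,
                           iq * ses, iq * age, ses * age])  # adds interactions

_, sse_first = fit_sse(X_first, income)
_, sse_inter = fit_sse(X_inter, income)

# 3 extra terms; the complete model has 6 predictors plus an intercept.
f_stat = nested_f(sse_first, sse_inter, extra_terms=3, n=n, k_complete=6)
print(f"SSE first-order: {sse_first:.1f}")
print(f"SSE interaction: {sse_inter:.1f}")
print(f"Nested F statistic: {f_stat:.2f}")
```

A large F statistic relative to the F distribution with 3 and n − 7 degrees of freedom would indicate that at least one interaction term contributes, in which case comparing t-values from the first-order model alone says little about which predictor is "better."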

Problem 2 In an appendix, the authors describe multiple regression as a “mathematical procedure that yields coefficients for each of [the independent variables], indicating how much of a change in [the dependent variable] can be anticipated for a given change in any particular [independent] variable, with all the others held constant.” Armed with this information and the fact that the estimate of β1 in the model just described is positive, H&M infer that a high IQ necessarily implies (or causes) a high income, and a low IQ inevitably leads to a low income. (Cause-and-effect inferences like this are made repeatedly throughout the book.)

Problem 3 The title of the book refers to the normal distribution and its well-known “bell-shaped” curve. There is a misconception among the general public that scores on intelligence tests (IQs) are normally distributed. In fact, most IQ scores have distributions that are decidedly skewed. Traditionally, psychologists and psychometricians have transformed these scores so that the resulting numbers have a precise normal distribution. H&M make a special point of doing this. Consequently, the measure of IQ used in all the regression models is normalized (i.e., transformed so that the resulting distribution is normal), despite the fact that regression methodology does not require predictor (independent) variables to be normally distributed.
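To make "normalized" concrete, here is a minimal sketch of one common way to force skewed scores into a normal shape: a rank-based inverse-normal transformation. The text does not say which procedure H&M used, so this method, the function name `normalize_scores`, the mean of 100 and standard deviation of 15, and the raw scores are all assumptions for illustration.

```python
from statistics import NormalDist

def normalize_scores(raw, mean=100.0, sd=15.0):
    """Map raw scores to normal scores by rank (ties share their average rank)."""
    n = len(raw)
    order = sorted(range(n), key=lambda i: raw[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied values.
        j = i
        while j + 1 < n and raw[order[j + 1]] == raw[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly inside (0, 1).
    return [mean + sd * std_normal.inv_cdf((r - 0.5) / n) for r in ranks]

# Hypothetical right-skewed raw test scores.
raw = [2, 3, 3, 4, 5, 7, 12, 30]
normalized = normalize_scores(raw)
for r, z in zip(raw, normalized):
    print(f"raw={r:>2}  normalized={z:6.1f}")
```

The transformation preserves only the ordering of the scores: the large gap between raw scores 12 and 30 is compressed to fit the bell shape, so a one-unit change in the normalized predictor no longer corresponds to a fixed change in the raw score.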

Problem 4 A variable that is not used as a predictor of social outcome in any of the models in The Bell Curve is level of education. H&M purposely omit education from the models, arguing that IQ causes education, not the other way around. Other researchers who have examined H&M’s data report that when education is included as an independent variable in the model, the effect of IQ on the dependent variable (say, income) is diminished.
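The pattern reported by the other researchers can be reproduced in a small simulation. The sketch below uses synthetic, standardized data in which education depends partly on IQ and income depends on both; the generating coefficients (0.6, 0.2, 0.5), the sample size, and the helper names are assumptions chosen for illustration, not estimates from H&M's data.

```python
import random
from statistics import fmean

def simple_slope(x, y):
    """Slope of the least-squares line of y on a single predictor x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (with intercept), closed form."""
    m1, m2, my = fmean(x1), fmean(x2), fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Synthetic standardized data (an assumption, NOT H&M's data).
random.seed(1)
n = 2000
iq = [random.gauss(0, 1) for _ in range(n)]
education = [0.6 * q + random.gauss(0, 0.8) for q in iq]
income = [0.2 * q + 0.5 * e + random.gauss(0, 1)
          for q, e in zip(iq, education)]

slope_without_edu = simple_slope(iq, income)
slope_with_edu, edu_slope = two_predictor_slopes(iq, education, income)

print(f"IQ slope, education omitted:  {slope_without_edu:.3f}")
print(f"IQ slope, education included: {slope_with_edu:.3f}")
```

Under these assumed values, the IQ slope with education omitted absorbs the education pathway (roughly 0.2 + 0.5 × 0.6), so it shrinks once education enters the model. The simulation shows only that the estimate is sensitive to the omission; it does not by itself settle which causal ordering is correct.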

a. Comment on each of the problems identified. Why do these problems cast a shadow on the inferences made by the authors?

b. Using the variables specified in the model presented, describe how you would conduct the multiple-regression analysis. (Propose a more complicated model and describe the appropriate model tests, including a residual analysis.)
