Questions and Answers of Regression Analysis
Adjusted means (continued): The notion of an "adjusted" mean was introduced in Exercises 7.2 and 7.3. Consider the main-effects model for the p-way classification: µ_jk…r ≡ E(Y_ijk…r)
ANOVA with equal cell frequencies: In higher-way ANOVA, as in two-way ANOVA, when cell frequencies are equal, the sum of squares for each set of effects can be calculated directly from the parameter
Calculate the fitted regression equation for each group (low and high partner's status) in Moore and Krupat's conformity data using the dummy regression in Equation 8.17 (page 188). Calculate the
Adjusted means (concluded): The notion of an adjusted mean was discussed in Exercises 7.2, 7.3, and 8.6. Now consider the ANCOVA model for two factors, R and C, and two covariates, X1 and X2: Y_ijk =
Reanalyze Moore and Krupat’s conformity data eliminating the two outlying observations, Subjects 16 and 19. Perform both a two-way ANOVA, treating authoritarianism as a factor, and an ANCOVA,
Equations 8.6 (page 167) show how the parameters in a dummy-coded two-way ANOVA model can be expressed in terms of the cell means µ_jk when the number of levels of the row factor R is r = 2 and the
*Orthogonal contrasts (see Section 9.1.2): Consider the equation β_F = X_B⁻¹µ relating the parameters β_F of the full-rank ANOVA model to the cell means µ. Suppose that X_B⁻¹ is constructed so
Verify that each of the terms in the sum-of-squares function (see Equation 9.9 on page 208) S(b) = y′y − y′Xb − b′X′y + b′X′Xb is (1 × 1), justifying writing S(b) = y′y − (2y′X)b +
Standardized regression coefficients: (a) *Show that the standardized coefficients can be computed as b* = R_XX⁻¹ r_Xy, where R_XX is the correlation matrix of the explanatory variables, and r_Xy is
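This identity is easy to check numerically. A minimal numpy sketch, not part of the exercise text (simulated data, all names illustrative): standardizing X and Y and regressing without an intercept reproduces b* = R_XX⁻¹ r_Xy.

import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = X @ np.array([1.5, -0.5, 2.0]) + rng.normal(size=n)

# Standardize the variables (mean 0, standard deviation 1).
ZX = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# b* from the correlations: b* = R_XX^{-1} r_Xy.
R_XX = np.corrcoef(X, rowvar=False)
r_Xy = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(k)])
b_star = np.linalg.solve(R_XX, r_Xy)

# b* by regressing standardized y on standardized X (no intercept needed).
b_check, *_ = np.linalg.lstsq(ZX, zy, rcond=None)
print(np.allclose(b_star, b_check))  # True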
*A crucial step in the proof of the Gauss-Markov theorem (Section 9.3.2) uses the fact that the matrix product AX must be 0 because AXβ = 0. Why is this the case? [Hint: The key here is that AXβ =
*For the statistic t = (B_j − β_j)/(SE √v_jj) to have a t-distribution, the estimators B_j and SE must be independent. [Here, v_jj is the jth diagonal entry of (X′X)⁻¹.] The coefficient B_j
*Using Equation 9.12 (page 214), show that the maximized likelihood for the linear model can be written as L = (2πe e′e/n)^(−n/2)
Using Duncan's regression of occupational prestige on income and education, and performing the necessary calculations, verify that the omnibus null hypothesis H0: β1 = β2 = 0 can be tested as a
*Consider the model Y_i = β0 + β1x_i1 + β2x_i2 + ε_i. Show that the matrix V_11⁻¹ (see Equation 9.16 on page 218) for the slope coefficients β1 and β2 contains mean-deviation sums of squares and
*Show that Equation 9.20 (page 222) for the confidence interval for β1 can be written in the more conventional form B1 − t_{α, n−3} S_E/√[Σx*_i1² (1 − r²_12)] ≤
Using Figure 9.2 (on page 223), show how the confidence interval–generating ellipse can be used to derive a confidence interval for the difference of the parameters β1 − β2. Compare the confidence
Prediction: One use of a fitted regression equation is to predict response-variable values for particular "future" combinations of explanatory-variable scores. Suppose, therefore, that we
Suppose that the model matrix for the two-way ANOVA model Y_ijk = µ + α_j + β_k + γ_jk + ε_ijk is reduced to full rank by imposing the following constraints (for r = 2 rows and c = 3
*Show that the equation-by-equation least-squares estimator B̂ = (X′X)⁻¹X′Y is the maximum-likelihood estimator of the regression coefficients B in the multivariate general linear model Y = XB
Intention to treat: Recall the imaginary example in Section 9.8.1 in which students were randomly provided vouchers to defray the cost of attending a private school. In the text, we imagined that the
*The asymptotic covariance matrix of the IV estimator is V = (1/n) plim[n(b_IV − β)(b_IV − β)′]. The IV estimator itself (Equation 9.28) can be written as b_IV = β + (Z′X)⁻¹Z′
Show that when the model matrix X is used as the IV matrix Z in instrumental-variables estimation, the IV and OLS estimators and their covariance matrices coincide. See Equations 9.28 and 9.29 (on
Two-stage least-squares estimation: (a) Suppose that the column x1 in the model matrix X also appears in the matrix Z of instrumental variables in 2SLS estimation. Explain why x̂1 in the first-stage
*The second principal component is w2 (n × 1) = A_12 z1 + A_22 z2 + ⋯ + A_k2 zk = Z_X (n × k) a2 (k × 1), with variance S²_W2 = a2′ R_XX a2. We need to maximize this variance subject to the normalizing
*Find the matrix A of principal-component coefficients when k = 2 and r12 is negative. (Cf. Equation 13.6 on page 352.)
*Show that when k = 2, the principal components of R_XX correspond to the principal axes of the data ellipse for the standardized regressors Z1 and Z2; show that the half-length of each axis is equal
*Use the principal-components analysis of the explanatory variables in B. Fox's time-series regression, given in Table 13.3, to estimate the nearly collinear relationships among the variables
*Show that Equation 13.7 (page 358) applied to the correlation matrix of the least-squares regression coefficients, computed from the coefficient covariance matrix S²_E(X′X)⁻¹, produces the same
Why are there 2^k − 1 distinct subsets of k explanatory variables? Evaluate this quantity for k = 2, 3, …, 15.
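The reasoning behind the count: each of the k variables is either in or out of a subset, giving 2^k combinations, and excluding the empty subset leaves 2^k − 1. A two-line Python check of the requested values:

# Number of nonempty subsets of k explanatory variables, k = 2, ..., 15.
for k in range(2, 16):
    print(k, 2**k - 1)   # 3, 7, 15, ..., 32767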
Apply the backward, forward, and forward/backward stepwise regression methods to B. Fox’s Canadian women’s labor force participation data. Compare the results of these procedures with those shown
*Show that the ridge-regression estimator of the standardized regression coefficients, b*_d = (R_XX + dI_k)⁻¹ r_Xy, can be written as a linear transformation b*_d = U b* of the usual least-squares
*Show that the variance of the ridge estimator is V(b*_d) = [σ*²_ε/(n − 1)] (R_XX + dI_k)⁻¹ R_XX (R_XX + dI_k)⁻¹. [Hint: Express the ridge estimator as a linear transformation of the standardized
*Finding the ridge constant d: Hoerl and Kennard suggest plotting the entries in b*_d against values of d ranging between 0 and 1. The resulting graph, called a ridge trace, both furnishes a visual
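A minimal numpy sketch of a ridge trace (the correlation values here are made up for illustration); it also verifies the linear-transformation identity from the exercise above, with U = (R_XX + dI_k)⁻¹R_XX:

import numpy as np

def ridge_coefs(R_XX, r_Xy, d):
    """Ridge estimator of the standardized coefficients: (R_XX + d I)^{-1} r_Xy."""
    return np.linalg.solve(R_XX + d * np.eye(len(r_Xy)), r_Xy)

R_XX = np.array([[1.0, 0.95], [0.95, 1.0]])  # hypothetical, nearly collinear
r_Xy = np.array([0.60, 0.55])
b_ols = np.linalg.solve(R_XX, r_Xy)          # d = 0 gives the least-squares b*

for d in np.linspace(0.0, 1.0, 11):          # plotting b*_d against d is the ridge trace
    b_d = ridge_coefs(R_XX, r_Xy, d)
    U = np.linalg.solve(R_XX + d * np.eye(2), R_XX)
    assert np.allclose(b_d, U @ b_ols)       # b*_d = U b*
    print(f"d = {d:.1f}: b*_d = {np.round(b_d, 3)}")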
Vary the span of the kernel estimator for the regression of prestige on income in the Canadian occupational prestige data. Does s = 0.4 appear to be a reasonable choice?
Selecting the span by smoothing residuals: A complementary visual approach to selecting the span in local-polynomial regression is to find the residuals from the local-regression fit, E_i =
Comparing the kernel and local-linear estimators: To illustrate the reduced bias of the local-linear estimator in comparison to the kernel estimator, generate n = 100 observations of artificial data
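The excerpt does not reproduce the exercise's regression function, so this sketch (mine, not the book's) uses sin(2πx) as a stand-in. Both estimators are weighted-least-squares fits at a focal point: degree 0 gives the kernel (local-constant) estimator, degree 1 the local-linear estimator, and the boundary point x0 = 0 shows the bias difference.

import numpy as np

def local_fit(x0, x, y, h, degree):
    """Locally weighted polynomial fit at x0 with a Gaussian kernel of bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.vander(x - x0, degree + 1, increasing=True)  # columns: 1, (x - x0), ...
    beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    return beta[0]                                      # intercept = fit at x0

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=100)

for x0 in (0.0, 0.5):  # boundary point vs. interior point
    print(f"x0 = {x0}: truth = {np.sin(2 * np.pi * x0):+.3f}, "
          f"kernel = {local_fit(x0, x, y, 0.1, 0):+.3f}, "
          f"local-linear = {local_fit(x0, x, y, 0.1, 1):+.3f}")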
*Bias, variance, and MSE as a function of bandwidth: Consider the artificial regression function introduced in the preceding exercise. Using Equation 18.1 (page 537), write down expressions for the
*Employing the artificial data generated in Exercise 18.3, use Equation 18.3 (on page 539) to compute the average squared error (ASE) of the local-linear regression estimator for various spans
*Continuing with the artificial data from Exercise 18.3, graph the cross-validation function CV(s) and generalized cross-validation function GCV(s) as a function of span, letting the span range
Comparing polynomial and local regression: (a) The local-linear regression of prestige on income with span s = 0.6 (in Figure 18.7 on page 543) has 5.006 equivalent degrees of freedom, very close to
Equivalent kernels: One way of comparing linear smoothers like local-polynomial estimators and smoothing splines is to think of them as variants of the kernel estimator, where fitted values arise as
*Prove that the median minimizes the least-absolute-values objective function: Σᵢ₌₁ⁿ ρ_LAV(E_i) = Σᵢ₌₁ⁿ |Y_i − µ̂|
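A sketch of the standard argument (my summary, not the book's derivation):

\frac{d}{d\mu}\sum_{i=1}^{n} |Y_i - \mu|
  = \sum_{i=1}^{n} -\operatorname{sign}(Y_i - \mu)
  = \#\{Y_i < \mu\} - \#\{Y_i > \mu\}
  \qquad (\mu \ne \text{any } Y_i).

The derivative is negative while fewer than half of the observations lie below µ and positive once more than half do, so the objective function decreases until µ reaches the middle of the data and increases thereafter; it is therefore minimized at the median (with a flat minimum between the two middle observations when n is even).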
Breakdown: Consider the contrived data set Y1 = −0.068, Y2 = −1.282, Y3 = 0.013, Y4 = 0.141, Y5 = −0.980 (an adaptation of the data used to construct Figure 19.1). Show that more than two values must
The following contrived data set (discussed in Chapter 3) is from Anscombe (1973): (a) Graph the data and confirm that the third observation is an outlier. Find the least-squares regression of Y on X,
Computing the LTS estimator: Why is it almost surely the case that the (k + 1) × (k + 1) matrix X*, with rows selected from among those of the complete model matrix X, is of full rank when all
In Chapter 15, I fit a Poisson regression of number of interlocks on assets, nation of control, and sector for Ornstein’s Canadian interlocking-directorate data. The results from this regression
Consider the following contrived data set for the variables X1, X2, and X3, where the question marks indicate missing data: (a) Using available cases (and recomputing the means and standard deviations
*In univariate missing data, where there are missing values for only one variable in a data set, some of the apparently distinct methods for handling missing data produce identical results for
*Duplicate the small simulation study reported in Table 20.2 on page 613, comparing several methods of handling univariate missing data that are MAR. Then repeat the study for missing data that are
*Equation 20.6 (on page 616) gives the ML estimators for the parameters µ1, µ2, σ²_1, σ²_2, and σ_12 in the bivariate-normal model with some observations on X2 missing at random but X1 completely
*Multivariate linear regression fits the model Y (n × m) = X (n × (k+1)) B ((k+1) × m) + E (n × m), where Y is a matrix of response variables; X is a model matrix (just as in the univariate linear
*Consider once again the case of univariate missing data MAR for two bivariately normal variables, where the first variable, X1, is completely observed, and m observations (for convenience, the first
As explained in Section 20.4.1, the efficiency of the multiple-imputation estimator of a coefficient β̃_j relative to the ML estimator β̂_j is RE(β̃_j) = g/(g + γ_j), where g is the number of
Examine the United Nations data on infant mortality and other variables for 207 countries, discussed in Section 20.4.4. (a) Perform a complete-case linear least-squares regression of infant mortality
Truncated normal distributions: (a) Suppose that ξ ∼ N(0, 1). Using Equations 20.14 (page 630) for the mean and variance of a left-truncated normal distribution, calculate the mean and variance of ξ
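Equations 20.14 are not reproduced in this excerpt; the sketch below uses the standard formulas for a standard-normal variable left-truncated at a, built on the inverse Mills ratio λ(a) = φ(a)/[1 − Φ(a)]: the truncated mean is λ(a) and the truncated variance is 1 − λ(a)[λ(a) − a]. scipy's truncnorm provides a cross-check.

import numpy as np
from scipy.stats import norm, truncnorm

def left_truncated_moments(a):
    """Mean and variance of Z ~ N(0, 1) conditional on Z > a."""
    lam = norm.pdf(a) / norm.sf(a)   # inverse Mills ratio
    return lam, 1.0 - lam * (lam - a)

a = 0.0
mean, var = left_truncated_moments(a)
check = truncnorm(a, np.inf)         # the same distribution, via scipy
print(mean, check.mean())            # 0.79788... = sqrt(2/pi)
print(var, check.var())              # 0.36338...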
*Suppose that ξ ∼ N(µ, σ²) is left-censored at ξ = a, so that Y = a for ξ ≤ a and Y = ξ for ξ > a. Using Equations 20.14 (on page 630) for the truncated normal distribution, show that (repeating
*Equations 20.16 (on page 631) give formulas for the mean and variance of a left-censored normally distributed variable. (These formulas are also shown in the preceding exercise.) Derive similar
*Using Equations 20.17 (page 631) for the incidentally truncated bivariate-normal distribution, show that the expectation of the error ε_i in the Heckman regression model (Equations 20.18 and 20.19 on
*As explained in the text, the Heckman regression model (Equations 20.18 and 20.19, page 632) implies that (Y_i | ζ_i > 0) = α + β1X_i1 + β2X_i2 + ⋯ + β_kX_ik + β_λλ_i + ν_i, where β_λ ≡
*The log-likelihood for the Heckman regression-selection model is given in Equation 20.20 (page 634). Derive this expression. (Hint: The first sum in the log-likelihood, for the observations for
*Explain how White's coefficient-variance estimator (see Section 12.2.3), which is used to correct the covariance matrix of OLS regression coefficients for heteroscedasticity, can be employed to
Greene (2003, p. 768) remarks that the ML estimates β̂_j of the regression coefficients in a censored-regression model are often approximately equal to the OLS estimates B_j divided by the proportion P
*Test the omnibus null hypothesis H0: β1 = β2 = 0 for the Huber M estimator in Duncan's regression of occupational prestige on income and education. (a) Base the test on the estimated asymptotic
Case weights: (a) *Show how case weights can be used to "adjust" the usual formulas for the least-squares coefficients and their covariance matrix. How do these case-weighted formulas compare
*Bootstrapping time-series regression: Bootstrapping can be adapted to time-series regression but, as in the case of fixed-X resampling, the procedure makes strong use of the model fit to the
*Prove that Mallows's Cp statistic, Cp_j = RSS_j/S²_E + 2s_j − n, can also be written Cp_j = (k + 1 − s_j)(F_j − 1) + s_j, where RSS_j is the residual sum of squares for model M_j; s_j is the number of
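A sketch of the algebra (my reconstruction, on the assumption that F_j is the incremental F-statistic comparing model M_j to the full model with k + 1 coefficients, and that S²_E = RSS/(n − k − 1) comes from the full model):

F_j = \frac{(\mathrm{RSS}_j - \mathrm{RSS})/(k + 1 - s_j)}{S_E^2}
\;\Longrightarrow\;
(k + 1 - s_j)(F_j - 1) + s_j
  = \frac{\mathrm{RSS}_j - \mathrm{RSS}}{S_E^2} - (k + 1) + 2 s_j
  = \frac{\mathrm{RSS}_j}{S_E^2} + 2 s_j - n,

using RSS/S²_E = n − k − 1 in the last step.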
Both the adjusted R², R̃² = 1 − [(n − 1)/(n − s)]·(RSS/TSS), and the generalized cross-validation criterion, GCV = n·RSS/(n − s)², penalize models that have large numbers of predictors. (Here, n is
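Both criteria are one-line formulas; here is a small Python rendering of the two expressions quoted above (function names are mine):

def adjusted_r2(rss, tss, n, s):
    """Adjusted R-squared: 1 - [(n - 1)/(n - s)] * RSS/TSS."""
    return 1.0 - (n - 1) / (n - s) * rss / tss

def gcv(rss, n, s):
    """Generalized cross-validation: n * RSS / (n - s)^2."""
    return n * rss / (n - s) ** 2

# For fixed RSS, increasing s shrinks n - s, which lowers adjusted R^2
# and raises GCV: both criteria penalize the number of parameters.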
Show that the differences in BIC values given in the first column of Table 22.1 (page 680) correspond roughly to the Bayes factors and posterior model probabilities given in columns 2 and 3 of the
Perform model selection for the baseball salary regression using a criterion or criteria different from the BIC, examining the "best" model of each size, and the "best" 10 or 15
Using the estimated fixed effects in the table on page 717 for the model fit to the High School and Beyond data, find the fixed-effect regression equations for typical low, medium, and high mean SES
*BLUPs: As discussed in Section 23.8, show that for the random-effects one-way ANOVA model, Y_ij = β1 + δ_1i + ε_ij, the weights w_i = n_i minimize the variance of the estimator β̂1 = Σᵢ₌₁ᵐ
*Prove that the least-squares estimate of the coefficient β2 for X_ij is the same in the following two fixed-effects models (numbered as in Section 23.7.1): Recall the context: The data are divided
*Using V[(δ, ε)] = diag(Ψ*, σ²_ε Λ), show that the covariance matrix of the response variable in the compact form of the LMM, y = Xβ + Zδ + ε, can be written as V(y) = ZΨ*Z′ + σ²_ε Λ.
*Show that the log-likelihood for the variance-covariance-component parameters ω given the fixed effects β can be written as (repeating Equation 23.22 from page 737) log_e L(ω | β, y) = −n
Further on migraine headaches: (a) A graph of the fixed effects for the mixed-effects logit model fit to the migraine headaches data is shown in Figure 24.1 (page 747), and the estimated parameters of
Further on recovery from coma: (a) The example in Section 24.2.1 on recovery from coma uses data on performance IQ. The original analysis of the data by Wong et al. (2001) also examined verbal IQ.
*Show that the correlation between the least-squares residuals E_i and the response-variable values Y_i is √(1 − R²). [Hint: Use the geometric vector representation of multiple regression (developed in
Nonconstant variance and specification error: Generate 100 observations according to the following model: Y = 10 + (1 × X) + (1 × D) + (2 × X × D) + ε, where ε ∼ N(0, 10²); the values of
*Weighted-least-squares estimation: Suppose that the errors from the linear regression model y = Xβ + ε are independent and normally distributed, but with different variances, ε_i ∼ N(0, σ²_i
*Show that when the covariance matrix of the errors is Σ = σ²_ε · diag{1/W²_1, …, 1/W²_n} ≡ σ²_ε W⁻¹, the weighted-least-squares estimator β̂ = (X′WX)⁻¹X′Wy = My is the minimum-variance
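A minimal numpy sketch of the estimator just quoted (simulated data, illustrative names):

import numpy as np

def wls(X, y, w):
    """Weighted least squares: b = (X'WX)^{-1} X'Wy with W = diag(w),
    where w_i is proportional to the inverse variance of error i."""
    XtW = X.T * w                       # X'W without forming diag(w)
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(1, 5, n)
X = np.column_stack([np.ones(n), x])
sigma_i = 0.5 * x                       # error spread grows with x
y = 1 + 2 * x + rng.normal(scale=sigma_i)

print(wls(X, y, 1 / sigma_i**2))        # close to [1, 2]

Rescaling each row of X and y by √w_i and then running OLS gives the identical answer, since (X′WX)⁻¹X′Wy is ordinary least squares on the rescaled data.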
*The impact of nonconstant error variance on OLS estimation: Suppose that Y_i = α + βx_i + ε_i, with independent errors, ε_i ∼ N(0, σ²_i), and σ_i = σ_ε x_i. Let B represent the OLS estimator
Experimenting with component-plus-residual plots: Generate random samples of 100 observations according to each of the following schemes. In each case, construct the component-plus-residual plots for
Consider an alternative analysis of the SLID data in which log wages is regressed on sex, transformed education, and transformed age—that is, try to straighten the relationship between log wages
Apply Mallows’s procedure to construct augmented component-plus-residual plots for the SLID regression of log wages on sex, age, and education. *Then apply Cook’s CERES procedure to this
*Figure 2.7 illustrates how, when the relationship between Y and X is nonlinear in an interval, the average value of Y in the interval can be a biased estimate of E(Y|x) at the center of the
Create a graph like Figure 4.1, but for the ordinary power transformations X → X^p for p = −1, 0, 1, 2, 3. (When p = 0, however, use the log transformation.) Compare your graph to Figure 4.1, and
*Show that the derivative of f(X) = (X^p − 1)/p is equal to 1 at X = 1, regardless of the value of p.
*We considered starts for transformations informally to ensure that all data values are positive and that the ratio of the largest to the smallest data values is sufficiently large. An alternative is
The Yeo-Johnson family of modified power transformations (Yeo & Johnson, 2000) is an alternative to using a start when both negative (or 0) and positive values are included in the data. The
*Prove that the least-squares fit in simple-regression analysis has the following properties: (a) Σ Ŷ_i E_i = 0. (b) Σ(Y_i − Ŷ_i)(Ŷ_i − Ȳ) = Σ E_i(Ŷ_i − Ȳ) = 0.
*Suppose that the means and standard deviations of Y and X are the same: Ȳ = X̄ and S_Y = S_X. (a) Show that, under these circumstances, B_Y|X = B_X|Y = r_XY, where B_Y|X is the least-squares slope for
*Show that A0 = Ȳ minimizes the sum of squares S(A0) = Σᵢ₌₁ⁿ (Y_i − A0)²
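A sketch of the usual calculus argument (mine, not quoted from the text):

\frac{dS}{dA_0} = -2\sum_{i=1}^{n}(Y_i - A_0) = 0
\;\Longrightarrow\; nA_0 = \sum_{i=1}^{n} Y_i
\;\Longrightarrow\; A_0 = \bar{Y},

and the second derivative 2n > 0 confirms that this stationary point is a minimum.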
Linear transformation of X and Y: (a) Suppose that the explanatory-variable values in Davis's regression are transformed according to the equation X′ = X − 10 and that Y is regressed on X′. Without
*Derive the normal equations (Equations 5.7) for the least-squares coefficients of the general multiple-regression model with k explanatory variables. [Hint: Differentiate the sum-of-squares function
Why is it the case that the multiple-correlation coefficient R² can never get smaller when an explanatory variable is added to the regression equation? [Hint: Recall that the regression equation is
Consider the general multiple-regression equation Y = A + B1X1 + B2X2 + ⋯ + B_kX_k + E. An alternative procedure for calculating the least-squares coefficient B1 is as follows: 1. Regress Y on X2
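The procedure begun above is the added-variable (Frisch-Waugh) recipe: residualize Y and X1 on the remaining regressors, then regress residual on residual. A numpy sketch with simulated data (names illustrative) confirming that it reproduces the multiple-regression B1:

import numpy as np

def resid(v, Z):
    """Residuals from the least-squares regression of v on Z (with intercept)."""
    Z1 = np.column_stack([np.ones(len(v)), Z])
    beta, *_ = np.linalg.lstsq(Z1, v, rcond=None)
    return v - Z1 @ beta

rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 3))
y = 2 + 1.0 * X[:, 0] - 0.5 * X[:, 1] + 0.25 * X[:, 2] + rng.normal(size=n)

b_full, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)

e_y = resid(y, X[:, 1:])           # Y residualized on X2, X3
e_1 = resid(X[:, 0], X[:, 1:])     # X1 residualized on X2, X3
b_1 = (e_1 @ e_y) / (e_1 @ e_1)    # slope of residual-on-residual regression

print(np.isclose(b_full[1], b_1))  # True: the two B1's coincide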
Partial correlation: The partial correlation between X1 and Y "controlling for" X2 through X_k is defined as the simple correlation between the residuals E_Y|2…k and E_1|2…k, given in
*Show that in simple-regression analysis, the standardized slope coefficient B* is equal to the correlation coefficient r. (In general, however, standardized slope coefficients are not correlations
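The one-line argument (my sketch): the standardized slope rescales B by S_X/S_Y, and in simple regression B = r_XY S_Y/S_X, so

B^{*} = B\,\frac{S_X}{S_Y}
      = \left( r_{XY}\,\frac{S_Y}{S_X} \right) \frac{S_X}{S_Y}
      = r_{XY}.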
*Demonstrate the unbiasedness of the least-squares estimators A and B of α and β in simple regression: (a) Expressing the least-squares slope B as a linear function of the observations, B = Σ m_iY_i (as in
*Using the assumptions of linearity, constant variance, and independence, along with the fact that A and B can each be expressed as a linear function of the Y_i's, derive the sampling variances of A
Showing 1–100 of 785.