Question:
Consider a data set consisting of n observations, n_c complete and n_m incomplete, for which the dependent variable, y_i, is missing. Data on the independent variables, x_i, are complete for all n observations, X_c and X_m. We wish to use the data to estimate the parameters of the linear regression model y = Xβ + ε. Consider the following imputation strategy:
Step 1: Linearly regress y_c on X_c and compute b_c.
Step 2: Use X_m to predict the missing y_m with X_m b_c. Then regress the full sample of observations, (y_c, X_m b_c), on the full sample of regressors, (X_c, X_m).
a. Show that the first and second step least squares coefficient vectors are identical.
b. Is the second step coefficient estimator unbiased?
c. Show that the sum of squared residuals is the same at both steps.
d. Show that the second step estimator of σ² is biased downward.
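The claims in parts a, c, and d can be checked numerically before working through the algebra. The following is a minimal sketch using simulated data, not part of the original exercise: the sample sizes, the coefficient vector `beta`, the error variance `sigma2`, and the helper `ols` are all assumptions chosen for illustration, and K denotes the number of columns of X.

```python
import numpy as np


def ols(X, y):
    """Return the least squares coefficients and the sum of squared residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return b, float(e @ e)


# Hypothetical simulation settings (assumptions, not from the exercise).
rng = np.random.default_rng(0)
n_c, n_m, K = 60, 40, 3
sigma2 = 4.0
beta = np.array([1.0, -2.0, 0.5])

# Regressors are observed for everyone; y is observed only for the complete cases.
X_c = np.column_stack([np.ones(n_c), rng.normal(size=(n_c, K - 1))])
X_m = np.column_stack([np.ones(n_m), rng.normal(size=(n_m, K - 1))])
y_c = X_c @ beta + rng.normal(scale=np.sqrt(sigma2), size=n_c)

# Step 1: regress y_c on X_c.
b_c, ssr_1 = ols(X_c, y_c)

# Step 2: impute y_m with X_m b_c, then regress the stacked sample.
X_full = np.vstack([X_c, X_m])
y_full = np.concatenate([y_c, X_m @ b_c])
b_2, ssr_2 = ols(X_full, y_full)

# Variance estimators: same numerator, different degrees of freedom.
n = n_c + n_m
s2_step1 = ssr_1 / (n_c - K)
s2_step2 = ssr_2 / (n - K)

print("coefficients identical:", np.allclose(b_c, b_2))
print("SSR identical:", np.isclose(ssr_1, ssr_2))
print("step 2 variance estimate is smaller:", s2_step2 < s2_step1)
```

The imputed rows lie exactly on the fitted plane, so they add no residual variation but do add n_m to the degrees of freedom. That is why the second step variance estimate comes out smaller by the factor (n_c - K)/(n - K).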
Step by Step Solution
a. To solve this, we will use an extension of Exercise 5 in Chapter 3, adding one row of d...
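Since the answer above is cut off, here is a direct sketch of all four parts using the normal equations. It is not necessarily the route the truncated answer takes. The symbols b (second step coefficient vector), e and e_c (second and first step residual vectors), and K (number of columns of X) are notation introduced here. The sketch assumes the usual conditions E[ε | X] = 0 and Var[ε | X] = σ²I, and that which observations are missing is unrelated to ε.

```latex
\begin{align*}
\text{(a)}\quad b &= (X_c'X_c + X_m'X_m)^{-1}\,(X_c'y_c + X_m'X_m b_c) \\
  &= (X_c'X_c + X_m'X_m)^{-1}\,(X_c'X_c + X_m'X_m)\, b_c = b_c,
  \quad\text{since } X_c'y_c = X_c'X_c\, b_c. \\[4pt]
\text{(b)}\quad E[b \mid X] &= E[b_c \mid X] = \beta. \\[4pt]
\text{(c)}\quad e &= \begin{pmatrix} y_c - X_c b_c \\ X_m b_c - X_m b_c \end{pmatrix}
  = \begin{pmatrix} e_c \\ 0 \end{pmatrix}
  \;\Rightarrow\; e'e = e_c'e_c. \\[4pt]
\text{(d)}\quad E\!\left[\frac{e'e}{n-K} \,\Big|\, X\right]
  &= \frac{(n_c-K)\,\sigma^2}{n-K} < \sigma^2 \quad\text{when } n_m > 0.
\end{align*}
```

Part (d) uses the standard result E[e_c'e_c | X] = (n_c - K)σ² from the first step regression. The second step divides the same sum of squares by n - K instead of n_c - K, which is the source of the downward bias.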
