Suppose we are interested in the effect of education on salary as expressed in the following model:

\[ \text { Salary }_{i}=\beta_{0}+\beta_{1} \text { Education }_{i}+\epsilon_{i} \]

For this problem, we are going to assume that the true model is

\[ \text { Salary }_{i}=12,000+1,000 \text { Education }_{i}+\epsilon_{i} \]

The model indicates that each person's salary is $\$ 12,000$ plus $\$ 1,000$ times the number of years of education, plus an error term specific to that individual. Our goal is to explore how much our estimate $\hat{\beta}_{1}$ varies from one simulated data set to the next.
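To see how to read the true model, take a hypothetical person with 16 years of education (a value picked purely for illustration). Assuming the error term averages zero, which the text implies but does not state, the expected salary is

\[ E\left[\text{Salary}_{i} \mid \text{Education}_{i}=16\right]=12,000+1,000 \times 16=28,000 \]

An individual's actual salary differs from this value by that person's error term $\epsilon_{i}$.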

The book's website provides code that will simulate a data set with 100 observations. (Stata code is in Ch3_SimulateBeta_StataCode.do; R code is in Ch3_SimulateBeta_StataCode.R.) Values of education for each observation are between 0 and 16 years. The error term is normally distributed with a standard deviation of 10,000.
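For readers who want to see the mechanics without Stata or R, the following is a minimal Python sketch of one simulated data set and its OLS fit. It is not the book's code. The function names `simulate_data` and `ols_fit` are hypothetical, and the sketch assumes that education is drawn uniformly between 0 and 16 and that the error term has mean zero, since the text gives only the range and the standard deviation.

```python
import random
import statistics


def simulate_data(n_obs=100, error_sd=10_000, seed=None):
    """Return (education, salary) lists drawn from the true model in the text."""
    rng = random.Random(seed)
    # Assumption: education is uniform on [0, 16]; the text only gives the range.
    education = [rng.uniform(0, 16) for _ in range(n_obs)]
    # Salary = 12,000 + 1,000 * Education + normal error with mean 0 (assumed).
    salary = [12_000 + 1_000 * x + rng.gauss(0, error_sd) for x in education]
    return education, salary


def ols_fit(x, y):
    """Bivariate OLS: return the (intercept, slope) estimates."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


education, salary = simulate_data(seed=1)
b0_hat, b1_hat = ols_fit(education, salary)
print(f"intercept estimate = {b0_hat:,.0f}, slope estimate = {b1_hat:,.0f}")
```

Changing `seed` produces a different data set and therefore a different slope estimate, which is the variation this problem asks about.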

(a) Explain why the means of the estimated coefficients across the multiple simulations are what they are.

(b) What are the minimum and maximum values of the estimated coefficients on education? Explain whether these values are inconsistent with our statement in the chapter that OLS estimates are unbiased.

(c) Rerun the simulation with a larger sample size in each simulation. Specifically, set the sample size to 1,000 in each simulation. Compare the mean, minimum, and maximum of the estimated coefficients on education to the original results above.

(d) Rerun the simulation with a smaller sample size in each simulation. Specifically, set the sample size to 20 in each simulation. Compare the mean, minimum, and maximum of the estimated coefficients on education to the original results above.

(e) Reset the sample size to 100 for each simulation, and rerun the simulation with a smaller standard deviation of the error term (equal to 500) for each simulation. Compare the mean, minimum, and maximum of the estimated coefficients on education to the original results above.

(f) Keeping the sample size at 100 for each simulation, rerun the simulation with a larger standard deviation of the error term for each simulation. Specifically, set the standard deviation to 50,000 for each simulation. Compare the mean, minimum, and maximum of the estimated coefficients on education to the original results above.

(g) Revert to the original model (sample size of 100 and standard deviation of 10,000). Now run 500 simulations. Summarize the distribution of the $\hat{\beta}_{\text{Education}}$ estimates as you have done so far, but now also plot the distribution of these coefficients using the code provided. Describe the density plot in your own words.
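The variations in parts (c) through (g) can also be explored with a short Python sketch like the one below. It is an illustration only, not the code from the book's website. The names `slope_estimates` and `text_histogram` are hypothetical, 50 simulations per setting is an assumed default because the text does not give the number, education is assumed to be uniform between 0 and 16, and a crude text histogram stands in for the density plot.

```python
import random
import statistics


def slope_estimates(n_sims, n_obs, error_sd, seed=0):
    """Simulate n_sims data sets and return the OLS slope from each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sims):
        # Assumption: education is uniform on [0, 16].
        x = [rng.uniform(0, 16) for _ in range(n_obs)]
        y = [12_000 + 1_000 * xi + rng.gauss(0, error_sd) for xi in x]
        x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
        num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        den = sum((xi - x_bar) ** 2 for xi in x)
        slopes.append(num / den)
    return slopes


def text_histogram(values, n_bins=10, width=40):
    """Return the lines of a crude text histogram, one line per bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the maximum value into the last bin.
        counts[min(int((v - lo) / step), n_bins - 1)] += 1
    peak = max(counts)
    return [f"{lo + i * step:8.0f} | " + "#" * round(width * c / peak)
            for i, c in enumerate(counts)]


# (sample size, error SD) pairs: the original setup and parts (c) through (f).
settings = {"original": (100, 10_000), "(c)": (1_000, 10_000),
            "(d)": (20, 10_000), "(e)": (100, 500), "(f)": (100, 50_000)}
for label, (n_obs, sd) in settings.items():
    s = slope_estimates(50, n_obs, sd)  # 50 simulations is an assumed default
    print(f"{label:>8}: mean={statistics.fmean(s):7.0f} "
          f"min={min(s):7.0f} max={max(s):7.0f}")

# Part (g): 500 simulations at the original settings.
print("\n".join(text_histogram(slope_estimates(500, 100, 10_000))))
```

Each histogram line starts with the lower edge of a bin of slope estimates, and the bar length is scaled so that the fullest bin spans `width` characters.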