Question:

1. (35 points) The variable smokes is a binary variable equal to one if a person smokes, and zero otherwise. Using the data in SMOKE, we estimate a linear probability model for smokes by OLS:

$$
\begin{aligned}
\widehat{smokes} ={}& \underset{(.855)\,[.856]}{.656} - \underset{(.204)\,[.207]}{.069}\,\log(cigprice) + \underset{(.026)\,[.026]}{.012}\,\log(income) - \underset{(.006)\,[.006]}{.029}\,educ \\
&+ \underset{(.006)\,[.005]}{.020}\,age - \underset{(.00006)\,[.00006]}{.00026}\,age^2 - \underset{(.039)\,[.038]}{.101}\,restaurn - \underset{(.052)\,[.050]}{.026}\,white
\end{aligned}
$$

$n = 807$, $R^2 = 0.062$
where
cigprice = the per pack price of cigarettes (in cents).
white = one if the respondent is white, and zero otherwise.
income = annual income.
educ = years of schooling.
age = age measured in years.
restaurn = a binary indicator equal to unity if the person resides in a state with restaurant smoking restrictions.
Both the usual standard errors (in parentheses) and the heteroskedasticity-robust standard errors (in brackets) are reported.
(a) Are there any important differences between the two sets of standard errors?
(b) Holding other factors fixed, how does residing in a state with restaurant smoking restrictions affect the probability that a person smokes?
(c) Person number 206 in the data set has the following characteristics: cigpric =67.44, income =6,500,
educ =12, age =77, restaurn =0, white =0, and smokes =0. Compute the predicted probability of smoking for this person and comment on the result.
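For part (c), the prediction is just the fitted equation evaluated at this person's values. The sketch below assumes the coefficients carry negative signs on log(cigprice), educ, age squared, restaurn, and white (the signs are an assumption if they are not visible in the printed equation), and it uses the rounded coefficients as reported. The names `b`, `x`, and `p_hat` are illustrative only.

```r
# Rounded coefficients from the reported equation.
# ASSUMPTION: negative signs on lcigpric, educ, agesq, restaurn, white.
b <- c(intercept = 0.656, lcigpric = -0.069, lincome = 0.012, educ = -0.029,
       age = 0.020, agesq = -0.00026, restaurn = -0.101, white = -0.026)

# Regressor values for person 206, in the same order as `b`.
x <- c(1, log(67.44), log(6500), 12, 77, 77^2, 0, 0)

# Predicted probability of smoking under the linear probability model.
p_hat <- sum(b * x)
print(round(p_hat, 3))  # about 0.121
```

With these rounded coefficients the prediction lies inside the unit interval and is fairly low, which agrees with the observed smokes = 0. The two age terms nearly cancel (.020 × 77 = 1.54 against .00026 × 77² ≈ 1.5415), so the result is sensitive to rounding in the age coefficients and would shift if unrounded estimates were used.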
2.(15 points) Consider a linear model to explain GPA:
$$GPA = \beta_0 + \beta_1\,study + \beta_2\,sleep + \beta_3\,work + u$$
$$E(u \mid study, sleep, work) = 0$$
$$Var(u \mid study, sleep, work) = \sigma^2\,study^2$$
Write the transformed equation that has a homoskedastic error term.
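One way to approach this, sketched under the assumption that the variance condition reads Var(u | study, sleep, work) = σ² study² and that study > 0 for every observation: dividing the equation through by study rescales the error so that its conditional variance no longer depends on the regressors.

```latex
\frac{GPA}{study}
  = \beta_0 \frac{1}{study} + \beta_1
  + \beta_2 \frac{sleep}{study}
  + \beta_3 \frac{work}{study}
  + \frac{u}{study},
\qquad
Var\!\left(\frac{u}{study} \,\Big|\, study, sleep, work\right)
  = \frac{\sigma^2\, study^2}{study^2} = \sigma^2 .
```

In the transformed equation, β1 plays the role of the intercept and β0 becomes the coefficient on 1/study.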
(The following questions are computer exercises and are to be answered with R code in RStudio.)
Use the data in PNTSPRD for this exercise.
(a) The variable sprdcvr is a binary variable equal to one if the Las Vegas point spread for a college basketball game was covered. The expected value of sprdcvr, say $\mu$, is the probability that the spread is covered in a randomly selected game.
Test $H_0: \mu = .5$ against $H_1: \mu \neq .5$ at the 10% significance level and discuss your findings. (Hint: This is easily done using a t test by regressing sprdcvr on an intercept only.)
(b) How many games in the sample of 553 were played on a neutral court?
(c) Estimate the linear probability model
$$sprdcvr = \beta_0 + \beta_1\,favhome + \beta_2\,neutral + \beta_3\,fav25 + \beta_4\,und25 + u$$
and report the results in the usual form. (Report the usual OLS standard errors and the heteroskedasticity-robust standard errors.) Which variable is most significant, both practically and statistically?
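The three parts of the PNTSPRD exercise could be approached roughly as follows. This is a sketch, not the assignment's reference solution: the helper names `test_mean` and `lpm_with_robust_se` are hypothetical, the robust standard errors use the HC0 (White) form computed by hand in base R, and the data are assumed to come from the `wooldridge` package as `pntsprd`. A small-sample correction such as HC1 would give slightly different robust values.

```r
# t test of H0: E(y) = mu0 from an intercept-only regression (hypothetical helper).
test_mean <- function(y, mu0 = 0.5) {
  fit <- lm(y ~ 1)
  est <- coef(summary(fit))["(Intercept)", "Estimate"]
  se  <- coef(summary(fit))["(Intercept)", "Std. Error"]
  t_stat <- (est - mu0) / se
  p_val  <- 2 * pt(-abs(t_stat), df = df.residual(fit))  # two-sided p-value
  c(estimate = est, se = se, t = t_stat, p = p_val)
}

# OLS coefficients with usual and HC0 robust standard errors (hypothetical helper).
lpm_with_robust_se <- function(formula, data) {
  fit <- lm(formula, data = data)
  X <- model.matrix(fit)
  u <- residuals(fit)
  XtX_inv <- solve(crossprod(X))
  meat <- crossprod(X * u)              # sum over i of u_i^2 * x_i x_i'
  V_hc0 <- XtX_inv %*% meat %*% XtX_inv # sandwich variance estimator (HC0)
  cbind(coef      = coef(fit),
        se_usual  = sqrt(diag(vcov(fit))),
        se_robust = sqrt(diag(V_hc0)))
}

# Usage on the actual data, run only if the wooldridge package is installed.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("pntsprd", package = "wooldridge")
  print(test_mean(pntsprd$sprdcvr))       # (a) reject H0 at 10% if p < 0.10
  print(sum(pntsprd$neutral == 1))        # (b) number of neutral-court games
  print(lpm_with_robust_se(sprdcvr ~ favhome + neutral + fav25 + und25,
                           pntsprd))      # (c) both sets of standard errors
}
```

Computing the sandwich formula by hand keeps the example free of extra packages; if the `sandwich` package is available, `sandwich::vcovHC(fit, type = "HC0")` should give the same variance matrix.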
2. Use the data in KIELMC for this exercise. See the functions I uploaded for this assignment for an example.
(a) Provide a table with summary statistics for this data set. Use the sumtable function. If your code does not produce output, comment out the code. This should still work when you knit to an HTML file.
(b) Using ggplot, create a scatter plot with log price on the vertical axis and log distance on the horizontal axis. On your scatter plot, make the observations that are from 1981 a different color than those from 1978.
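A possible starting point for the KIELMC exercise is sketched below. It assumes the data come from the `wooldridge` package as `kielmc` with columns `price`, `dist`, and `year`, and that `sumtable` refers to the function in the `vtable` package; the helper name `plot_lprice_ldist` is hypothetical. The uploaded course functions mentioned in the assignment are not available here, so this does not follow them.

```r
library(ggplot2)

# Scatter plot of log(price) against log(dist), colored by year (hypothetical helper).
plot_lprice_ldist <- function(df) {
  ggplot(df, aes(x = log(dist), y = log(price), color = factor(year))) +
    geom_point(alpha = 0.7) +
    labs(x = "log(distance)", y = "log(price)", color = "Year")
}

# Usage on the actual data, run only if the needed packages are installed.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("kielmc", package = "wooldridge")

  # (a) Summary statistics; out = "return" gives a data frame that can be printed.
  if (requireNamespace("vtable", quietly = TRUE)) {
    print(vtable::sumtable(kielmc, out = "return"))
  }

  # (b) Observations from 1978 and 1981 appear in different colors.
  print(plot_lprice_ldist(kielmc))
}
```

Wrapping `year` in `factor()` makes ggplot use a discrete color scale with one color per year instead of a continuous gradient. In an R Markdown document, calling `vtable::sumtable(kielmc)` without the `out` argument should render as a formatted table when knitting.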

Step by Step Solution


Question (a): Differences between standard errors. We are provided both the usual standard errors and the heteroskedasticity-robust standard errors. Here's a compari...
