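The variable smokes is a binary variable equal to one if a person smokes, and zero otherwise. Using the data in SMOKE, we estimate the following linear probability model for smokes by OLS:
$$\widehat{smokes} = \hat\beta_0 + \hat\beta_1 \log(cigpric) + \hat\beta_2 \log(income) + \hat\beta_3\, educ + \hat\beta_4\, age + \hat\beta_5\, age^2 + \hat\beta_6\, restaurn + \hat\beta_7\, white$$
The sample size $n$ and the $R^2$ are reported with the estimates,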
where
cigpric = the per-pack price of cigarettes, in cents.
white = one if the respondent is white, and zero otherwise.
income = annual income.
educ = years of schooling.
age = age measured in years.
restaurn = a binary indicator equal to one if the person resides in a state with restaurant smoking restrictions.
Both the usual and heteroskedasticity-robust standard errors are reported.
(a) Are there any important differences between the two sets of standard errors?
(b) Holding other factors fixed, how does residing in a state with restaurant smoking restrictions affect the probability that a person smokes?
(c) A particular person in the data set has recorded values of cigpric, income, educ, age, restaurn, white, and smokes. Compute the predicted probability of smoking for this person and comment on the result.
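The comparison of usual and heteroskedasticity-robust standard errors in a linear probability model can be sketched in base R as follows. This is only an illustration on simulated data: the variable names `x`, `d`, `y`, the coefficients used to generate the data, and the values in `new_person` are all hypothetical and are not taken from SMOKE. The robust standard errors are the HC0 (White) version computed by hand; other small-sample variants exist.

```r
# Simulated stand-in for a binary-outcome data set.
# All names and numbers below are hypothetical, not from SMOKE.
set.seed(1)
n <- 500
x <- rnorm(n)                                   # a continuous regressor
d <- rbinom(n, 1, 0.3)                          # a binary regressor
p <- pmin(pmax(0.4 - 0.1 * x - 0.1 * d, 0), 1)  # true response probability
y <- rbinom(n, 1, p)                            # binary outcome

# Linear probability model: OLS with a binary dependent variable.
lpm <- lm(y ~ x + d)

# Usual OLS standard errors.
se_usual <- sqrt(diag(vcov(lpm)))

# Heteroskedasticity-robust (HC0) standard errors computed by hand:
# (X'X)^{-1} [X' diag(uhat^2) X] (X'X)^{-1}
X <- model.matrix(lpm)
uhat <- resid(lpm)
XtX_inv <- solve(crossprod(X))
meat <- crossprod(X * uhat)                     # equals X' diag(uhat^2) X
V_hc0 <- XtX_inv %*% meat %*% XtX_inv
se_robust <- sqrt(diag(V_hc0))

print(cbind(estimate = coef(lpm), se_usual, se_robust))

# Predicted probability for one hypothetical person. An LPM prediction
# is not constrained to lie in [0, 1], which is worth checking.
new_person <- data.frame(x = 2.5, d = 1)
p_hat <- predict(lpm, newdata = new_person)
print(p_hat)
```

With real data, the same steps apply after replacing the simulated variables with the columns of the data frame; a prediction below zero or above one is a known limitation of the linear probability model rather than a coding error.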
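Consider a linear model to explain GPA:
$$GPA = \beta_0 + \beta_1\, study + \beta_2\, sleep + \beta_3\, work + u$$
$$E(u \mid study, sleep, work) = 0$$
where $\mathrm{Var}(u \mid study, sleep, work)$ is proportional to a known function of study.
Write the transformed equation that has a homoskedastic error term.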
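As a general reminder, not tied to the specific variance function in this problem: suppose the error variance can be written as $\sigma^2 h(\mathbf{x})$ for a known, strictly positive function $h$. The symbols $h$, $y$, and $x_1, \dots, x_k$ are generic notation introduced here for illustration. Dividing every term of the equation, including the intercept, by $\sqrt{h(\mathbf{x})}$ gives an error with constant variance:

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u,
  \qquad \mathrm{Var}(u \mid \mathbf{x}) = \sigma^2 h(\mathbf{x}), \\
\frac{y}{\sqrt{h}} &= \beta_0 \frac{1}{\sqrt{h}} + \beta_1 \frac{x_1}{\sqrt{h}}
  + \dots + \beta_k \frac{x_k}{\sqrt{h}} + \frac{u}{\sqrt{h}}, \\
\mathrm{Var}\!\left(\frac{u}{\sqrt{h}} \,\Big|\, \mathbf{x}\right)
  &= \frac{\sigma^2 h(\mathbf{x})}{h(\mathbf{x})} = \sigma^2 .
\end{aligned}
```

Note that the transformed equation has no constant term in the usual sense: the original intercept $\beta_0$ now multiplies the regressor $1/\sqrt{h}$.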
The following questions are computer exercises and are to be answered with code in RStudio.
Use the data in PNTSPRD for this exercise.
(a) The variable sprdcvr is a binary variable equal to one if the Las Vegas point spread for a college basketball game was covered. The expected value of sprdcvr, say $\mu$, is the probability that the spread is covered in a randomly selected game. Test $H_0$ against $H_1$ at the given significance level and discuss your findings. Hint: This is easily done using a t test by regressing sprdcvr on an intercept only.
(b) How many games in the sample were played on a neutral court?
(c) Estimate the linear probability model
$$sprdcvr = \beta_0 + \beta_1\, favhome + \beta_2\, neutral + \beta_3\, fav25 + \beta_4\, und25 + u$$
and report the results in the usual form. Report both the usual OLS standard errors and the heteroskedasticity-robust standard errors. Which variable is most significant, both practically and statistically?
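The hint in part (a), testing a hypothesis about a mean by regressing on an intercept only, can be sketched in base R as follows. The data here are simulated, and both the variable `y` and the null value `mu0` are assumptions chosen for illustration; substitute the actual variable and the hypothesized value from the question.

```r
# Simulated binary outcome; the data and the null value mu0 are hypothetical.
set.seed(42)
y <- rbinom(400, 1, 0.5)
mu0 <- 0.5                        # assumed null value, for illustration only

# Regress y on an intercept only: the intercept estimate is the sample mean.
fit <- lm(y ~ 1)
est <- coef(fit)[["(Intercept)"]]
se  <- sqrt(vcov(fit)[1, 1])

# t statistic and two-sided p-value for H0: mu = mu0.
t_stat <- (est - mu0) / se
p_val  <- 2 * pt(-abs(t_stat), df = df.residual(fit))

cat("estimate:", est, " t:", t_stat, " p-value:", p_val, "\n")

# Cross-check against the built-in one-sample t test.
tt <- t.test(y, mu = mu0)
```

The default t statistic printed by `summary(fit)` tests whether the intercept equals zero, so the statistic for a nonzero null value has to be computed by hand as above or obtained from `t.test`.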
Use the data in KIELMC for this exercise. Refer to the functions I uploaded for this assignment for an example.
(a) Provide a table with summary statistics for this data set. Use the sumtable function. If your code does not produce output, comment out the code. This should still work when you knit to an HTML file.
(b) Using ggplot, create a scatter plot with log price on the vertical axis and log distance on the horizontal axis. On your scatter plot, make the observations that are from a different color than those from