Question:
(i) Create a variable, sqft_total, inside of your data frame, equal to the sum of sqft_above and sqft_basement.
(ii) Run a regression of price on sqft_total. Store the results in an object named ols_price_sqft_total. How do you interpret the coefficient on sqft_total? Is it a reasonable estimate?
(iii) Does it make sense to run a regression of price on sqft_total, sqft_above, and sqft_basement? Why or why not?
(iv) What is the sample correlation coefficient for sqft_total and floors? Hint: the cor.test() function in R provides the sample correlation between two variables, along with a test of whether the correlation is statistically significantly different from zero. Our standard correlation measure from class is also called the Pearson correlation coefficient.
(v) Overwrite your previous log_bedrooms variable by running the following line of code:
    houses$log_bedrooms = log(houses$bedrooms + 1)
Rerun the regression of price on log_bedrooms. Store the results in an object named ols_price_log_bedrooms. How do you interpret the coefficient on log_bedrooms? Is it a reasonable estimate?
(vi) Create a new variable, log_price, inside of your data frame, equal to the log() of price. Run a regression of log_price on log_bedrooms (from (v)) and store the results in an object named ols_log_price_log_bedrooms. How do you interpret the coefficient on log_bedrooms? Is it a reasonable estimate?
(vii) Based on your results, do you prefer the log specifications in (v) and (vi) or the level (not logged) specifications in (ii) and (iii)? Explain.
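The mechanics behind parts (i) through (vi) can be sketched in R as follows. This is only an illustration under stated assumptions: the houses data frame below is synthetic stand-in data (not the course data set), the underscore column names are assumed, and ols_collinear is a hypothetical object name used only for part (iii). The numbers it produces say nothing about the real answers.

```r
# Synthetic stand-in for the houses data frame (NOT the real data set).
# Column names with underscores are an assumption.
set.seed(1)
n <- 200
houses <- data.frame(
  sqft_above    = round(runif(n, 800, 3000)),
  sqft_basement = round(runif(n, 0, 1200)),
  bedrooms      = sample(0:6, n, replace = TRUE),
  floors        = sample(c(1, 1.5, 2, 3), n, replace = TRUE)
)
houses$price <- 50000 + 200 * (houses$sqft_above + houses$sqft_basement) +
  20000 * houses$bedrooms + rnorm(n, sd = 40000)

# (i) New column equal to the sum of two existing columns
houses$sqft_total <- houses$sqft_above + houses$sqft_basement

# (ii) Simple regression of price on sqft_total
ols_price_sqft_total <- lm(price ~ sqft_total, data = houses)
summary(ols_price_sqft_total)

# (iii) sqft_total is an exact linear combination of the other two
# regressors, so lm() cannot estimate all three slopes and reports NA
# for one of them (ols_collinear is a hypothetical name)
ols_collinear <- lm(price ~ sqft_total + sqft_above + sqft_basement,
                    data = houses)
coef(ols_collinear)

# (iv) Pearson correlation plus a test of H0: correlation = 0
ct <- cor.test(houses$sqft_total, houses$floors)
ct$estimate
ct$p.value

# (v) log(bedrooms + 1) stays finite when bedrooms is 0
houses$log_bedrooms <- log(houses$bedrooms + 1)
ols_price_log_bedrooms <- lm(price ~ log_bedrooms, data = houses)

# (vi) Log-log specification
houses$log_price <- log(houses$price)
ols_log_price_log_bedrooms <- lm(log_price ~ log_bedrooms, data = houses)
coef(ols_log_price_log_bedrooms)
```

As a reminder of the usual textbook readings: in a level-log regression, a 1% increase in the regressor is associated with a change of roughly the coefficient divided by 100 in the outcome, and in a log-log regression the coefficient is read as an elasticity. Here the regressor is bedrooms + 1 rather than bedrooms, which is worth keeping in mind when judging whether an estimate is reasonable.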
