Question:

In this exam you will work with a simulated housing dataset to test whether rural house prices rose faster than non-rural house prices after 2020 because of the pandemic. You will use a diff-in-diff analysis and, at the same time, examine how factors like square footage and number of bedrooms affect house price.
There are two datasets:
sales.csv: A dataset where each row is a house sale. The variables are:
sale_id: A unique identifier for the sale
sale_price: The price at which the house was sold
sale_year: The year in which the house was sold
zipcode: The zipcode of the house
sq_ft: The square footage of the house
lot_sq_ft: The square footage of the lot
num_bedrooms: The number of bedrooms in the house
num_bathrooms: The number of bathrooms in the house
built_before_1977: Whether the house was built before 1977
has_garage: Whether the house has a garage
has_view: Whether the house has a nice view
zips.csv: A dataset where each row is a zipcode. The variables are:
zipcode: The zipcode
is_rural: Whether or not the zipcode is rural
There are four tasks.
Task 1
Read both data files into R and merge them. Name the resulting dataframe df.
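One possible way to approach Task 1 is sketched below in base R. The real sales.csv and zips.csv are not available here, so the sketch first writes two tiny made-up stand-in files to a temporary folder (the folder, the prices, and the zipcodes are all assumptions for illustration) and then reads and merges them on the shared zipcode column.

```r
# Hypothetical stand-ins for sales.csv and zips.csv, written to a temp folder
# so the sketch runs without the real exam files. All values are made up.
tmp <- tempdir()
write.csv(data.frame(sale_id    = 1:4,
                     sale_price = c(200000, 250000, 310000, 180000),
                     sale_year  = c(2018, 2019, 2020, 2021),
                     zipcode    = c(10001, 10002, 10001, 10002)),
          file.path(tmp, "sales.csv"), row.names = FALSE)
write.csv(data.frame(zipcode  = c(10001, 10002),
                     is_rural = c(TRUE, FALSE)),
          file.path(tmp, "zips.csv"), row.names = FALSE)

# Read both files.
sales <- read.csv(file.path(tmp, "sales.csv"))
zips  <- read.csv(file.path(tmp, "zips.csv"))

# Attach is_rural to every sale by matching on the shared zipcode column.
df <- merge(sales, zips, by = "zipcode")
```

With the real files, the two read.csv() calls would point at the actual file paths. By default merge() keeps only rows whose zipcode appears in both files, so comparing nrow(df) with nrow(sales) is a quick check that no sales were dropped.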
Task 2
Create three new variables in df:
treated: A boolean variable that is TRUE if the house is in a rural zipcode
post: A boolean variable that is TRUE if the sale year is 2020 or later (sale_year >= 2020)
treatedXpost: Equal to treated*post
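The three variables can be built roughly as follows. This is a minimal sketch on a made-up four-row stand-in for df, and it assumes is_rural is stored in a form that as.logical() understands (TRUE/FALSE, 0/1, or "TRUE"/"FALSE" text); the real file may encode it differently.

```r
# Toy stand-in for the merged data frame df (values are made up).
df <- data.frame(sale_year = c(2018, 2019, 2020, 2021),
                 is_rural  = c(TRUE, FALSE, TRUE, FALSE))

# treated: TRUE for sales in a rural zipcode.
# Assumption: is_rural can be converted with as.logical().
df$treated <- as.logical(df$is_rural)

# post: TRUE for sales in 2020 or later.
df$post <- df$sale_year >= 2020

# treatedXpost: multiplying two logicals gives 1 only when both are TRUE.
df$treatedXpost <- df$treated * df$post
```

Here treatedXpost equals 1 only for the rural sale in 2020, the one row that is both treated and in the post period.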
Task 3
Make a diff-in-diff plot showing the average log sale price by year for the treated (rural) and untreated (non-rural) groups.
The plot title should be "Average Log Sale Price by Year, Rural vs. Non-Rural"
The X axis should be labeled "Year"
The Y axis should be labeled "Avg Log Sale Price"
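A hedged sketch of the plot using base R graphics follows. The data frame is a made-up stand-in, and the colors, legend text, and dashed line at the 2019/2020 boundary are illustrative choices rather than exam requirements. ggplot2 would work equally well; base graphics are used here only to avoid extra packages.

```r
# Toy stand-in for df (made-up prices); the real df comes from Tasks 1 and 2.
df <- data.frame(
  sale_year  = rep(2018:2021, each = 4),
  treated    = rep(c(TRUE, TRUE, FALSE, FALSE), times = 4),
  sale_price = c(200, 220, 300, 320, 210, 230, 310, 330,
                 260, 280, 320, 340, 290, 310, 330, 350) * 1000
)

# Average log sale price for each year-by-group cell.
df$log_price <- log(df$sale_price)
avg <- aggregate(log_price ~ sale_year + treated, data = df, FUN = mean)

rural    <- avg[avg$treated, ]
nonrural <- avg[!avg$treated, ]

# One line per group, with the title and axis labels required by the task.
plot(rural$sale_year, rural$log_price, type = "b", col = "red",
     ylim = range(avg$log_price),
     main = "Average Log Sale Price by Year, Rural vs. Non-Rural",
     xlab = "Year", ylab = "Avg Log Sale Price")
lines(nonrural$sale_year, nonrural$log_price, type = "b", col = "blue")
abline(v = 2019.5, lty = 2)  # boundary between pre (<= 2019) and post (>= 2020)
legend("topleft", legend = c("Rural (treated)", "Non-rural (untreated)"),
       col = c("red", "blue"), lty = 1, pch = 1)
```

The plot is mainly a visual check of parallel trends: the two lines should move roughly together before 2020 for the diff-in-diff comparison to be credible.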
Task 4
Run one regression with log(sale_price) as the dependent variable. Include the following variables as covariates:
log(sq_ft)
log(lot_sq_ft)
num_bedrooms
num_bathrooms
built_before_1977
has_garage
has_view
treatedXpost
Additionally, include sale_year and zipcode as dummy variables
Report the regression results using Stargazer. (The table should only have one column).
Interpret the results
In addition, you must answer a question asking you to interpret the results (3 pts)
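The regression could be set up along the lines below. The data are simulated inside the sketch with an arbitrary, made-up data-generating process, so the printed numbers say nothing about the real exam dataset. Wrapping sale_year and zipcode in factor() turns them into dummy variables. The stargazer call is guarded so the sketch still runs if that package is not installed.

```r
set.seed(1)
n <- 400

# Made-up zipcodes: two rural, two non-rural.
zips <- data.frame(zipcode  = c(10001, 10002, 10003, 10004),
                   is_rural = c(TRUE, TRUE, FALSE, FALSE))

# Made-up sales with the covariates named in the task.
df <- data.frame(
  sale_year         = sample(2017:2022, n, replace = TRUE),
  zipcode           = sample(zips$zipcode, n, replace = TRUE),
  sq_ft             = runif(n, 800, 4000),
  lot_sq_ft         = runif(n, 2000, 20000),
  num_bedrooms      = sample(1:5, n, replace = TRUE),
  num_bathrooms     = sample(1:3, n, replace = TRUE),
  built_before_1977 = sample(c(TRUE, FALSE), n, replace = TRUE),
  has_garage        = sample(c(TRUE, FALSE), n, replace = TRUE),
  has_view          = sample(c(TRUE, FALSE), n, replace = TRUE)
)
df <- merge(df, zips, by = "zipcode")
df$treatedXpost <- as.numeric(df$is_rural & df$sale_year >= 2020)

# Arbitrary data-generating process, for illustration only.
df$sale_price <- exp(11 + 0.6 * log(df$sq_ft) + 0.1 * log(df$lot_sq_ft) +
                     0.10 * df$treatedXpost + rnorm(n, sd = 0.2))

# One regression: the listed covariates plus year and zipcode dummies.
model <- lm(log(sale_price) ~ log(sq_ft) + log(lot_sq_ft) + num_bedrooms +
              num_bathrooms + built_before_1977 + has_garage + has_view +
              treatedXpost + factor(sale_year) + factor(zipcode),
            data = df)

# Report a one-column table if stargazer is available.
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(model, type = "text")
}
```

treated and post do not appear on their own because the zipcode dummies absorb treated (rural status is fixed within a zipcode) and the year dummies absorb post. The coefficient on treatedXpost is the diff-in-diff estimate: because the dependent variable is in logs, a coefficient b corresponds to an approximate 100*b percent change in price for rural houses after 2020 relative to non-rural houses, or exactly 100*(exp(b) - 1) percent. The coefficients on log(sq_ft) and log(lot_sq_ft) are elasticities, and the coefficients on the remaining covariates are approximate proportional price changes per one-unit change.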
