Question: Data Collection and Cleaning Lets assume that you get a new job at a real estate brokerage firm. You decide to apply analytics to the
Data Collection and Cleaning
Lets assume that you get a new job at a real estate brokerage firm. You decide to apply analytics to the housing market data to give better guidance to your Realtors so they can make marketing decisions more efficiently. The work will help you to analyze the factors which will affect the purchase price of your customers dream house. However, you will also want to create a professionally authored document explaining your findings to your boss.
Part 1: Data Collection
Collect and code the data: We are going to collect data for your groups assigned city/area.
Go to a website providing real estate data (for example, realtor.com, zillow.com) and choose single family homes (not an apartment, condo, lots). Randomly select 70 houses from the area in or around the City you have been assigned (your group name in Blackboard). Save the data you collect in a spreadsheet (recommend each group member collect some, then combine), project1_5880_ teamName.xlsx following the data descriptions below.
Use the following variable description to collect the data.
price = purchase price in $1000s
bdrms = number of bedrooms
bths = number of bathrooms
sqft = size of the house in square feet
age = age of the house (current year year built)
pool = 1 if house has a swimming pool, 0 otherwise
garage = 1 if the house has a garage, 0 otherwise
city = 1 if the house is located within the City limits, 0 otherwise
From the data, you may see the observations such as:
Obesrvation 1:
4 beds, 3 baths, 2762 square feet
Built in 1998
No swimming pool
2 car garage
Price: $364,900
Address: XXXX, City, Alabama
Observation 2:
3 beds, 3 baths, 3100 square feet
Built in 2005
Swimming Pool
1 car garage
Price: $400,550
Address: XXXX, City 2, Alabama
Each observation would then be coded into the spreadsheet. For observation 1, pool=0 and garage=1, and so on.
Part 2:
Create a properly encoded and cleaned dataset in an Excel spreadsheet on Sheet 1 (labeled data).
- Code each variable as appropriate (e.g. 1 or 0).
- If the address is in your specific City, create an IF structure to code a variable InCity to 1. Otherwise, this should be 0.
- On sheet 2 (labeled "data dictionary"), provide information about each variable.
- Categorize each variable.
- Describe each variable.
- On Sheet 2 (labeled descriptives), provide the following:
- Produce MIN, MAX, MEAN of each variable (where that makes sense).
- Produce boxplots of price, square feet variables (new worksheet if necessary).
- Produce bubble charts of price with the other variables.
- Use the data visualization tools to color code or use data bars on price and square feet.
- Produce scatter charts where you believe they are appropriate between variables (new worksheet if necessary).
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
