Question:

Junction is a small town with two suburbs. The data file "Major Project - Data Set" contains data on 540 houses sold in Junction between 2017 and 2022. The data includes the price at which each house was sold, which of two agents sold it (by law, all houses are sold through an agent), the year in which it was sold, and various characteristics of each house (age, size, number of stories, etc.). These characteristics serve as possible explanatory variables for sale price. Data definitions follow:

OBS = observation

AGE = age of house in years

SHOPS = 1 if house is close to a shopping precinct, 0 otherwise

CRIME = crime rate of the suburb within which the house is located

TOWN = distance in kilometres to the town centre

STORIES = number of dwelling stories

OCEAN = 1 if house has an ocean view, 0 otherwise

POOL = 1 if house has a pool, 0 otherwise

PRICE = price at which the house was sold (in dollars)

AGENT = selling agent - "W&M" (0) or "A&B" (1)

SIZE = size of the house in square metres

SUBURB = Mayfair (0) or Claygate (1)

TENNIS = 1 if house has a tennis court, 0 otherwise

SOLD = year of last sale (2016 to 2021)

Task 1

You are required to provide a comprehensive summary of the data set contained in the "Major Project - Data Set" file. How you choose to do so is entirely at your discretion. However, it is recommended that you consider using both summary statistics and graphical methods, while also noting any peculiarities within the data set.
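As a rough illustration of the summary-statistic side of Task 1, the sketch below computes a few basic measures for one numeric variable using only Python's standard library. The `summarise` helper and the `prices` values are made up for illustration; they are not taken from the data set.

```python
import statistics


def summarise(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Hypothetical PRICE values, for illustration only (not from the data set)
prices = [350000, 420000, 515000, 610000, 480000]

summary = summarise(prices)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same helper could be applied to AGE, SIZE, TOWN and the other numeric variables; the 0/1 variables are usually better summarised with frequency counts.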

Task 1 directed you to take note of any peculiarities in the data set. There are additional errors in the data set that you may not have picked up on in Task 1. These will only become clear to you once you start working on Task 2. Several problems can result if you fail to handle these issues correctly, so be mindful to address them, both in your regression application and in your final report. If resolving any of the errors in the data set requires you to make assumptions, make sure to clearly state your reasoning and approach in your report.
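One simple way to look for such errors is to compare each 0/1 variable against its data definition. The sketch below is only an illustration: it assumes the rows have been loaded as dictionaries keyed by variable name and that AGENT and SUBURB are stored as numbers rather than text. The `find_invalid_dummies` helper and the sample rows are hypothetical.

```python
# Variables that the data definitions describe as taking only the values 0 or 1
DUMMY_COLUMNS = ["SHOPS", "OCEAN", "POOL", "AGENT", "SUBURB", "TENNIS"]


def find_invalid_dummies(rows):
    """Return (OBS, column, value) for every dummy value that is not 0 or 1."""
    problems = []
    for row in rows:
        for col in DUMMY_COLUMNS:
            if row[col] not in (0, 1):
                problems.append((row["OBS"], col, row[col]))
    return problems


# Hypothetical rows, for illustration only (not from the data set)
rows = [
    {"OBS": 1, "SHOPS": 1, "OCEAN": 0, "POOL": 0, "AGENT": 1, "SUBURB": 0, "TENNIS": 0},
    {"OBS": 2, "SHOPS": 3, "OCEAN": 0, "POOL": 1, "AGENT": 0, "SUBURB": 1, "TENNIS": 0},
]

problems = find_invalid_dummies(rows)
print(problems)
```

Any observation this check reports would need a documented decision (correct, recode, or exclude) before running the regression.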

Note: We have noted that there is an error in our data set for the SHOPS variable. This variable should be a dummy variable that takes a value of 0 or 1, indicating whether the house is close to a shopping precinct. However, the SHOPS variable has values of 0, 1, 2 and 3 in the data set. Thus, we have removed the values of 2 and 3 by recoding them back to a value of 0 or 1.

How will we recode the data back to the values of 0 or 1?

Excel data for the project

https://1drv.ms/x/s!AkCrFp_7u6LZizxkZpskccGjI5UA
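The question does not say which rule should map the values 2 and 3 back to 0 or 1, so any recoding rests on an assumption. The sketch below shows two possible strategies, both hypothetical: treating 2 and 3 as "close to a shopping precinct" (recoded to 1), or treating them as data-entry errors and marking them as missing. The `recode_shops` function and its `strategy` parameter are illustrative names, not part of the assignment.

```python
def recode_shops(value, strategy="collapse_to_one"):
    """Recode a raw SHOPS value to 0 or 1, or to None if treated as missing."""
    if value in (0, 1):
        return value  # already a valid dummy value
    if value in (2, 3):
        if strategy == "collapse_to_one":
            # ASSUMPTION: 2 and 3 still mean the house is close to shops
            return 1
        if strategy == "missing":
            # ASSUMPTION: 2 and 3 are entry errors with no recoverable meaning
            return None
        raise ValueError(f"unknown strategy: {strategy}")
    raise ValueError(f"unexpected SHOPS value: {value}")


# Hypothetical raw SHOPS values, for illustration only
raw = [0, 1, 2, 3, 1, 0]

collapsed = [recode_shops(v) for v in raw]
as_missing = [recode_shops(v, strategy="missing") for v in raw]

print(collapsed)
print(as_missing)
```

Whichever rule is used, the assignment asks that the reasoning behind it be stated clearly in the report. Observations marked as missing would also need to be excluded from, or otherwise handled in, the regression.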
