Question: Note that examples of problems for you to find and solve can be: - Identify which suburb / location had the biggest growth in SalePrice

Note that examples of problems for you to find and solve can be:

- Identify which suburb/location had the biggest growth in SalePrice by plotting and examining the sale prices across different suburbs;
- Analyse a possible pattern of SalePrice vs YrSold/MoSold, LotArea and/or some other variables which can reasonably be included;
- Use predictions from your final model to compare suburbs which have shown varying growth, or to identify which suburbs have been growing the most over the last few years.
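As a rough illustration of the first problem, growth can be measured as the percentage change in a suburb's median SalePrice between its first and last sale year. The sketch below assumes made-up records and hypothetical suburb names; only SalePrice and YrSold come from the brief, and the same table could feed a time series plot per suburb.

```python
from collections import defaultdict
from statistics import median

# Made-up illustrative records: (suburb, YrSold, SalePrice).
# The suburb names are hypothetical placeholders.
sales = [
    ("Northside", 2014, 400_000), ("Northside", 2014, 420_000),
    ("Northside", 2016, 500_000), ("Northside", 2016, 530_000),
    ("Southside", 2014, 300_000), ("Southside", 2014, 310_000),
    ("Southside", 2016, 320_000), ("Southside", 2016, 330_000),
]

def median_price_by_suburb_year(records):
    """Return {suburb: {year: median SalePrice}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for suburb, year, price in records:
        grouped[suburb][year].append(price)
    return {
        suburb: {year: median(prices) for year, prices in by_year.items()}
        for suburb, by_year in grouped.items()
    }

def growth_by_suburb(records):
    """Percentage change in median price from each suburb's first to last year."""
    growth = {}
    for suburb, by_year in median_price_by_suburb_year(records).items():
        first, last = min(by_year), max(by_year)
        growth[suburb] = 100 * (by_year[last] - by_year[first]) / by_year[first]
    return growth

growth = growth_by_suburb(sales)
top_suburb = max(growth, key=growth.get)  # suburb with the largest growth
```

The median is used here rather than the mean because a few very expensive sales would otherwise dominate a suburb's yearly figure; that choice is an assumption, not something the brief prescribes.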

UG students (unit 11374): Generate and address at least five problems.

G students (unit 11517): Generate and address at least seven problems, including the last problem listed above which uses predictions from your final model, e.g. find a way to compare the predictions (maybe median?) between suburbs (could be the top 5 suburbs) which have shown varying growth from your time series plots of growth over time.
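One possible way to compare predictions between suburbs, as suggested above, is to take the median predicted SalePrice for each suburb of interest and rank them. This is only a sketch: the predictions, the suburb names, and the helper `compare_median_predictions` are hypothetical stand-ins for the output of a final model.

```python
from collections import defaultdict
from statistics import median

# Made-up (suburb, predicted SalePrice) pairs standing in for model output.
predictions = [
    ("Northside", 510_000), ("Northside", 530_000), ("Northside", 495_000),
    ("Southside", 320_000), ("Southside", 335_000),
    ("Eastside", 410_000), ("Eastside", 450_000), ("Eastside", 430_000),
]

def compare_median_predictions(pairs, suburbs):
    """Median prediction for the chosen suburbs, highest first."""
    grouped = defaultdict(list)
    for suburb, price in pairs:
        if suburb in suburbs:
            grouped[suburb].append(price)
    medians = {suburb: median(prices) for suburb, prices in grouped.items()}
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)

# Suburbs picked by eye from the growth plots (assumed choice).
ranking = compare_median_predictions(predictions, {"Northside", "Eastside"})
```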

2. Data preprocessing:

In this section you should:

- Preprocess your data, treat missing values etc.
- Note at least one key observation, e.g. identified possible missing values or outliers for a particular area/suburb or year, e.g. 2016 is significantly higher. Or perhaps one column is missing more than 50% of its values.
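A minimal sketch of the missing-value check described above: compute the share of missing values per column, flag columns above the 50% mark, and fill the remaining gaps in a numeric column with its median. The rows and the `PoolQC` column are made up for illustration, and median imputation is just one reasonable treatment.

```python
from statistics import median

# Made-up rows; None marks a missing value. PoolQC is a hypothetical column.
rows = [
    {"SalePrice": 350_000, "LotArea": 600, "PoolQC": None},
    {"SalePrice": 420_000, "LotArea": None, "PoolQC": None},
    {"SalePrice": 390_000, "LotArea": 750, "PoolQC": "Gd"},
    {"SalePrice": 510_000, "LotArea": 820, "PoolQC": None},
]

def missing_share(records):
    """Fraction of missing (None) values in each column."""
    n = len(records)
    return {col: sum(r[col] is None for r in records) / n for col in records[0]}

shares = missing_share(rows)

# Columns missing more than 50% of their values are candidates for dropping.
mostly_missing = [col for col, share in shares.items() if share > 0.5]

# Fill the remaining LotArea gaps with the median of the observed values.
lot_median = median(r["LotArea"] for r in rows if r["LotArea"] is not None)
cleaned = [
    {**r, "LotArea": lot_median if r["LotArea"] is None else r["LotArea"]}
    for r in rows
]
```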

3. EDA:

In this section you should:

- Include tasks such as determining which variables are significant, which observations may be outliers etc., and other EDA goals.
- Find as much insight as possible to support your modelling decisions later on.
- Use data visualisation techniques taught in the unit to answer your chosen problems of interest.

4. Further preprocessing:

In this section you should:

- Select the final variables for your model based on your EDA (basically remove the non-significant variables).
- Create any new variables which you think may help based on your EDA in this section.
- Justify your decisions and provide EDA evidence as to how a variable is insignificant (e.g. no observable relationship to target variable in scatter plot).

5. Modelling:

In this section you should:

- Fit and evaluate a linear model to describe the relationship between your target variable and a number of selected significant predictors.
- Use your model to predict the prices of properties described by your test dataset.

Alternatively, you may use another, more advanced model of your choice. If you do use a linear model, remember what it favours, such as a normally distributed target variable.

6. Evaluation:

You should:

- Evaluate your model against the metric RMSE given the actual values in the test dataset.
- Plot the residuals similar to that shown in the Week 10 slides. Pick a suitable cut-off value for the red dots.

The data science methodology is an iterative process. Try to minimise your RMSE, so always go back and think about what improvements can be made, then fit another model, and find your second RMSE, and so on, noting what works and what does not. Compare at least two different models you considered, noting their differences.
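For reference, RMSE is the square root of the mean squared difference between actual and predicted values. The sketch below computes it and flags the residuals whose magnitude exceeds a cut-off, which are the points that would be drawn in red on a residual plot. The values and the 50,000 cut-off are made up; the Week 10 slides may use a different convention.

```python
from math import sqrt

# Made-up actual and predicted SalePrice values for a test set.
actual = [300_000, 410_000, 520_000, 450_000]
predicted = [310_000, 400_000, 500_000, 520_000]

def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Residual = actual - predicted.
residuals = [t - p for t, p in zip(actual, predicted)]

# Assumed cut-off: residuals larger than this in magnitude get highlighted.
cutoff = 50_000
flagged = [i for i, res in enumerate(residuals) if abs(res) > cutoff]

model_rmse = rmse(actual, predicted)
```

Running the same `rmse` function on the predictions of each candidate model gives a like-for-like basis for the comparison of at least two models.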

7. Recommendations and final conclusions:

You should:

- Summarise your findings and provide your found solutions to your problems of interest. Note anything you found particularly interesting and useful to your project.
- State the best RMSE you obtained and why/how (e.g. what variables you used, any applied transformations etc.).
- State any improvements you could make and why/how you could achieve such improvements in future works.

8. References:

You should:

- Include a reference list and cite your references via in-text referencing or footnotes.
