Question:

Download the CSV data from here: https://we.tl/t-vfij9A9Hl1

List all numerical variables that contain missing data and print the percentage of missing values per variable (use the training data).
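
One possible way to approach this step with pandas is sketched below. The CSV itself is not reproduced here, so the small `X_train` frame and its column names (`LotFrontage`, `MasVnrArea`, `LotArea`, `Alley`) are hypothetical stand-ins, and `numeric_missing_pct` is an illustrative helper name, not part of any library.

```python
import numpy as np
import pandas as pd

# Hypothetical toy training set standing in for the real CSV (assumption).
X_train = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 80.0, np.nan, 70.0],
    "MasVnrArea": [196.0, 0.0, np.nan, 350.0, 0.0],
    "LotArea": [8450, 9600, 11250, 9550, 14260],
    "Alley": [None, "Grvl", None, None, "Pave"],  # non-numeric, so it is skipped
})


def numeric_missing_pct(df):
    """Percentage of missing values for each numerical column that has any."""
    numeric = df.select_dtypes(include="number")
    pct = numeric.isnull().mean() * 100
    # Keep only the columns with at least one missing value, largest share first.
    return pct[pct > 0].sort_values(ascending=False)


missing_pct = numeric_missing_pct(X_train)
vars_with_na = missing_pct.index.tolist()

print(vars_with_na)
for var, pct in missing_pct.items():
    print(f"{var}: {pct:.1f}% missing")
```

With the real data, `X_train` would come from reading the CSV and splitting it into training and test sets before calling the helper.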

Using the result of the previous step: for numerical variables with less than 15% of their data missing, replace the missing values with the mean of the variable; for the other variables, replace the missing values with the median of the variable. Compute both statistics on the training set only, then apply the replacement to X_train and X_test.
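
A minimal sketch of this imputation rule follows, assuming pandas and a hypothetical pair of toy frames in place of the real data. The helper name `fit_fill_values` and the sample values are illustrative assumptions. The fill values are learned from `X_train` alone and then reused unchanged for `X_test`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (assumption): LotFrontage has 20% missing, MasVnrArea 10%.
X_train = pd.DataFrame({
    "LotFrontage": [60.0, np.nan, 80.0, np.nan, 70.0, 100.0, 65.0, 75.0, 90.0, 85.0],
    "MasVnrArea": [100.0, 0.0, np.nan, 300.0, 0.0, 200.0, 0.0, 50.0, 150.0, 190.0],
})
X_test = pd.DataFrame({
    "LotFrontage": [np.nan, 72.0],
    "MasVnrArea": [np.nan, 120.0],
})


def fit_fill_values(train, threshold=0.15):
    """Choose a fill value per numerical column, using the training set only."""
    numeric = train.select_dtypes(include="number")
    frac_missing = numeric.isnull().mean()
    fill = {}
    for var in frac_missing[frac_missing > 0].index:
        if frac_missing[var] < threshold:
            fill[var] = float(train[var].mean())    # under 15% missing: mean
        else:
            fill[var] = float(train[var].median())  # otherwise: median
    return fill


fill_values = fit_fill_values(X_train)

# Apply the same training-derived values to both sets.
X_train = X_train.fillna(value=fill_values)
X_test = X_test.fillna(value=fill_values)

print(fill_values)
```

Only columns with missing values in the training set receive a fill value here; a column that is complete in training but has gaps in the test set would need a separate decision.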

In the train and test sets, replace the values of the variables 'YearBuilt', 'YearRemodAdd' and 'GarageYrBlt' with the time elapsed between each of them and the year in which the house was sold, 'YrSold'. After that, drop the 'YrSold' column.
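
The year transformation could be sketched as below, again with pandas and hypothetical toy rows instead of the real data. `elapsed_years` is an illustrative helper name. It takes elapsed time as `YrSold` minus the original year, which is one reading of "time elapsed".

```python
import pandas as pd

# Hypothetical toy rows (assumption); the real frames come from the CSV.
X_train = pd.DataFrame({
    "YearBuilt": [2003, 1976],
    "YearRemodAdd": [2003, 1990],
    "GarageYrBlt": [2003.0, 1980.0],
    "YrSold": [2008, 2007],
})
X_test = pd.DataFrame({
    "YearBuilt": [1915],
    "YearRemodAdd": [1970],
    "GarageYrBlt": [1998.0],
    "YrSold": [2006],
})

YEAR_VARS = ["YearBuilt", "YearRemodAdd", "GarageYrBlt"]


def elapsed_years(df, year_vars=YEAR_VARS, sold_var="YrSold"):
    """Replace each year column with years elapsed until sale, then drop the sale year."""
    df = df.copy()  # leave the caller's frame untouched
    for var in year_vars:
        df[var] = df[sold_var] - df[var]
    return df.drop(columns=[sold_var])


X_train = elapsed_years(X_train)
X_test = elapsed_years(X_test)

print(X_train)
print(X_test)
```

If 'GarageYrBlt' still contains missing values at this point, the subtraction leaves them missing, so running the imputation step first keeps the result complete.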
