Question: These question must Use R programming Languages. Please give answer back in code Country Age Salary Purchased France 44 72000 No Spain 27 NA Yes

These question must Use R programming Languages. Please give answer back in code

Country Age Salary Purchased
France 44 72000 No
Spain 27 NA Yes
Germany 30 54000 No
Spain 38 61000 N0
Germany 40 Yes
France 35 58000 Yes
Spain 52000 no
France 48 79000 Yes
Germany 50 83000 No
France NA 67000 Y
  • Read-in the Data. It is a synthetic dataset that shows customer information of a mortgage company.
  • Generate summary of missing values, and inconsistent values for each of the features. Your script should generate a table similar to the one shown below:

Features

Missing Values (MV)

% of MV

(MV/960)

Inconsistency Values (IV)

% of IV

(IV/960)

Country

Age

Salary

Purchased

An example of a missing value is NA in record 2 and the feature Salary. Another example of a missing value is for record 7 and the feature age.

An example of inconsistency is in record 10 and the feature purchased (Y for Yes).

  • Handle the missing values. Specifically, estimate missing values of age by computing the feature mean grouped by country. Similarly estimate missing values of Salary by computing the feature mean grouped by country. You may also use the target feature for the missing value estimation. Please mention in your script what strategy do you to handle the missing values.
  • Correct the data inconsistency issue. The target feature is a binary class (Yes and No). In the correct state of the date, however, it has 5 class labels (Yes, Y, No, no, and N0). Correct this problem by converting the values into the appropriate class labels (Yes and No).

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!