Question: One of the central steps in any data mining task is to preprocess the data. This includes normalization, handling missing values, feature selection, and so on.

One of the central steps in any data mining task is to preprocess the data. This
includes normalization, handling missing values, feature selection, and so on. So,
Alice wants to prepare the data first before she tries to apply any data mining
algorithm. To assist Alice in her quest, please answer the following questions:
(a) By looking at the data (or your answer to (2.c)), do you think normalization
as described in class is important for this task? Why or why not? (Hint:
do the attributes have the same scale? Should we treat a difference in GPA
from 3.0 to 4.0 the same as we treat a difference in age from 20 to 21?)
(b) The data contains some missing values, so Alice has to handle them somehow.
She considers the options described in class (e.g., eliminate, replace,
estimate, and so on). By looking at the data, answer the following
questions:
- Is it a good idea to eliminate the records with missing values?
  (Hint: how will this impact the number of available records?)
- Explain how Alice can estimate the missing values of the following
  attributes: age, GPA, gender, pre-requisite, and pre-test score.
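
The two ideas in (a) and (b) can be illustrated with a small sketch. It is not the method from class, which the question does not spell out: it assumes min-max normalization for (a), and for (b) it assumes mean replacement for numeric attributes and most-frequent-value replacement for categorical ones. The records below are hypothetical toy values, not Alice's data, and only three of the five attributes are shown.

```python
from statistics import mean, mode

# Hypothetical toy records (NOT Alice's actual data); None marks a missing value.
records = [
    {"age": 20, "gpa": 3.0, "gender": "F"},
    {"age": 21, "gpa": 4.0, "gender": "M"},
    {"age": None, "gpa": 2.0, "gender": "F"},
    {"age": 40, "gpa": None, "gender": None},
]

NUMERIC = ["age", "gpa"]
CATEGORICAL = ["gender"]


def impute(rows):
    """Fill missing numeric values with the column mean and missing
    categorical values with the most frequent observed value.
    (An assumed strategy; other estimates are possible.)"""
    filled = [dict(r) for r in rows]
    for col in NUMERIC + CATEGORICAL:
        observed = [r[col] for r in rows if r[col] is not None]
        fill = mean(observed) if col in NUMERIC else mode(observed)
        for r in filled:
            if r[col] is None:
                r[col] = fill
    return filled


def min_max(rows, col):
    """Rescale one numeric column to the range [0, 1]."""
    values = [r[col] for r in rows]
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


complete = impute(records)
age_scaled = min_max(complete, "age")
gpa_scaled = min_max(complete, "gpa")

# Raw differences are both 1 (age 20 -> 21, GPA 3.0 -> 4.0), but after
# rescaling they are no longer equal.
print(age_scaled[1] - age_scaled[0])
print(gpa_scaled[1] - gpa_scaled[0])
```

In this toy example, the one-point GPA gap covers half of the observed GPA range, while the one-year age gap covers a twentieth of the observed age range, which is the contrast the hint in (a) points at. Dropping every record with a missing value would leave only two of the four toy records, which is the concern raised in the hint for (b).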
