Question: Inspect the dataset titled lab01_dataset_1.csv, which has a mixture of numerical and
categorical data. Your task will be to write a function myID3 which can create a decision
tree for the given dataset using the ID3 algorithm. However, before doing that, you will
have to perform some data processing tasks. Here are all the required tasks, in order:
1. ID3 cannot handle continuous numerical data. Perform the necessary operations to
handle all continuous-valued attributes. Do not forget to show the output, i.e. the
updated dataset after handling continuous-valued attributes. (marks)
2. Next, you will have to ensure the newly obtained dataset is optimal and free of
errors. Take appropriate actions based on the outcomes.
a. Check if the dataset has any missing values. (1 mark)
b. Check if the dataset has any redundant or repeated input samples. (1 mark)
c. Check if the dataset has any contradicting pairs. (1 mark)
3. Your function myID3 should operate in a manner such that after every round of
decision making, it will output each attribute and its associated gain, with a message
stating "Attribute X with Gain Y is chosen as the decision attribute." Once your
function completes, it should output the decision tree. The representation of the
decision tree is up to you. You can choose either a textual representation or a
graphical one; either is fine. (marks)
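The preprocessing tasks above can be sketched as follows. This is only an illustration under stated assumptions: the real dataset is not shown here, so `SAMPLE_CSV`, its column names, and the `play` target are hypothetical stand-ins. Splitting a continuous column at its median and dropping every row of a contradicting group are just one possible choice of "appropriate action". The sketch uses only the standard library; with pandas, `cut`, `isna`, and `duplicated` would be natural equivalents.

```python
import csv
import io
from statistics import median

# Hypothetical stand-in for the real CSV file; columns and values are invented.
SAMPLE_CSV = """outlook,temperature,play
sunny,85,no
sunny,85,no
overcast,64,yes
rain,,yes
rain,70,yes
rain,70,no
"""
TARGET = "play"  # assumed name of the class column


def load_rows(text):
    """Parse CSV text into a list of dicts, one per sample."""
    return [dict(r) for r in csv.DictReader(io.StringIO(text))]


def discretize(rows, column):
    """Task 1: map a continuous column to 'low'/'high' around its median.
    Empty cells are left untouched so the missing-value check still sees them."""
    cut = median(float(r[column]) for r in rows if r[column] != "")
    out = []
    for r in rows:
        r = dict(r)
        if r[column] != "":
            r[column] = "high" if float(r[column]) > cut else "low"
        out.append(r)
    return out


def drop_missing(rows):
    """Task 2a: keep only rows in which every field is filled."""
    return [r for r in rows if all(v != "" for v in r.values())]


def drop_duplicates(rows):
    """Task 2b: keep the first occurrence of each identical row."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r.items())
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out


def feature_key(row, target):
    """All attribute values of a row except the class label."""
    return tuple((k, v) for k, v in row.items() if k != target)


def drop_contradictions(rows, target):
    """Task 2c: drop rows whose attributes match another row with a different label."""
    labels = {}
    for r in rows:
        labels.setdefault(feature_key(r, target), set()).add(r[target])
    return [r for r in rows if len(labels[feature_key(r, target)]) == 1]


rows = discretize(load_rows(SAMPLE_CSV), "temperature")
print("After discretization:", rows)
no_missing = drop_missing(rows)
print("Rows with missing values:", len(rows) - len(no_missing))
unique = drop_duplicates(no_missing)
print("Repeated rows:", len(no_missing) - len(unique))
cleaned = drop_contradictions(unique, TARGET)
print("Contradicting rows:", len(unique) - len(cleaned))
print("Cleaned dataset:", cleaned)
```

On the invented sample, one row has a missing value, one row is a repeat, and two rows contradict each other, so two rows remain after cleaning.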
This dataset is relatively small and easy to understand just by looking at it. But you must
perform all the above tasks via coding. Brute-forcing the answers or directly solving the
mathematics involved in ID3 without coding it in Python will not get you a score.
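For the tree-building task, a minimal sketch of the requested function, written here as `myID3`, might look like the following. It assumes the data has already been discretized and cleaned, and it represents the tree as nested dictionaries, which is only one of the permitted textual representations. The `DATA` rows, the attribute names, and the `play` target are hypothetical and do not come from the real dataset.

```python
from collections import Counter
from math import log2

# Hypothetical, already-cleaned categorical samples (not the real dataset).
DATA = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
]


def entropy(rows, target):
    """Shannon entropy (in bits) of the class labels in rows."""
    total = len(rows)
    counts = Counter(r[target] for r in rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def gain(rows, attr, target):
    """Information gain obtained by splitting rows on attr."""
    total = len(rows)
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder


def myID3(rows, attrs, target):
    """Return a leaf label, or a nested dict {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:  # pure node: stop with that label
        return labels[0]
    if not attrs:  # no attributes left: fall back to the majority label
        return Counter(labels).most_common(1)[0][0]
    gains = {a: gain(rows, a, target) for a in attrs}
    for a, g in gains.items():
        print(f"  {a}: gain = {g:.3f}")
    best = max(gains, key=gains.get)
    print(f"Attribute {best} with Gain {gains[best]:.3f} "
          "is chosen as the decision attribute.")
    rest = [a for a in attrs if a != best]
    return {best: {v: myID3([r for r in rows if r[best] == v], rest, target)
                   for v in sorted({r[best] for r in rows})}}


def show(tree, indent=0):
    """Print the nested-dict tree as indented text."""
    if not isinstance(tree, dict):
        print(" " * indent + "-> " + tree)
        return
    for attr, branches in tree.items():
        for value, subtree in branches.items():
            print(" " * indent + f"{attr} = {value}")
            show(subtree, indent + 4)


tree = myID3(DATA, ["outlook", "windy"], "play")
show(tree)
```

On the invented rows, `outlook` has the larger gain at the root, and the `rain` branch is then split on `windy`. Ties between equal gains are broken by attribute order here, which is an arbitrary choice.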
