Question: Please help, this is urgent. Please answer BOTH QUESTIONS in Python.

Dataset 1 has four columns called Mood, Effort, Score, and Output. Mood is either Happy, Neutral, or Sad. Effort is either High, Medium, or Low. Score is a number between 10 and 100. Output has only two values, yes or no. There are 19 rows in total.

Dataset 2 has six columns called Age, Sex, BP, Cholestrol, Na_to_K, and Output. Age has values between 10 and 100. Sex is either F or M. BP is either High, Normal, or Low. Cholestrol is either High or Normal. Na_to_K has float values from 0 to 40. Output is one of drugA, drugB, drugC, drugX, or drugY. It has 200 rows.

Problem 1: (15 marks)
Inspect the dataset titled lab01_dataset_1.csv, which has a mixture of numerical and
categorical data. Your task is to write a function my_ID3() that creates a decision
tree for the given dataset using the ID3 algorithm. Before doing that, however, you will
have to perform some data processing tasks. Here are all the required tasks, in order:
1. ID3 cannot handle continuous numerical data. Perform the necessary operations to
handle all continuous-valued attributes. Do not forget to show the output, i.e., the
updated dataset after handling continuous-valued attributes. (2 marks)
2. Next, you will have to ensure the newly obtained dataset is optimal and free of
errors. Take appropriate actions based on the outcomes.
a. Check if the dataset has any missing values. (1 mark)
b. Check if the dataset has any redundant or repeated input samples. (1 mark)
c. Check if the dataset has any contradicting x=Y

Problem 2: (10 marks)
Inspect the dataset titled lab01_dataset_2.csv, which also has a mixture of numerical and
categorical data. For this problem, you will use decision tree classifiers for supervised
learning. In particular, you will be using the functionality of the sklearn.tree library. The
classification tasks in the sklearn libraries work only on numerical-valued attributes, not
on categorical ones. (What to do now? Hint: look up One-hot Encoding and Integer
Encoding.) Here are all the required tasks:
1. Restructure the dataset such that it has all numerical-valued attributes. (2 marks)
2. Perform supervised learning using decision tree classifiers. Employ the train-test
split approach during the learning. (4 marks)
3. After the learning is complete, show the results by predicting the class of the test
set. Display the results of the prediction and test set side-by-side. (2 marks)
4. Output the decision tree; it can be either a textual representation or a graphical
representation. (2 marks)

Step by Step Solution
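
For the data processing tasks in Problem 1, one possible approach with pandas is sketched here. The rows are made up to match the column description in the question; the real data would come from `pd.read_csv("lab01_dataset_1.csv")`. The bin edges for Score (40 and 70) and the choice to drop contradicting rows are assumptions, not something the assignment specifies.

```python
import pandas as pd

# Made-up rows in the shape described in the question (not the real data).
df = pd.DataFrame({
    "Mood":   ["Happy", "Sad", "Neutral", "Happy", "Sad", "Happy", "Sad"],
    "Effort": ["High", "Low", "Medium", "High", "Low", "High", "Low"],
    "Score":  [91, 23, 55, 95, 30, 88, None],
    "Output": ["yes", "no", "yes", "yes", "yes", "yes", "no"],
})

# Task 1: turn the continuous Score into three categories.
# The bin edges are an assumption; missing scores stay missing here.
df["Score"] = pd.cut(df["Score"], bins=[0, 40, 70, 100],
                     labels=["Low", "Medium", "High"])
print(df)  # the updated dataset after handling the continuous attribute

# Task 2a: count missing values, then drop the incomplete rows.
n_missing = int(df.isna().sum().sum())
df = df.dropna().reset_index(drop=True)
df["Score"] = df["Score"].astype(str)  # plain strings from here on

# Task 2b: count fully repeated rows, then keep one copy of each.
n_dup = int(df.duplicated().sum())
df = df.drop_duplicates().reset_index(drop=True)

# Task 2c: contradictions are rows with identical inputs but different Output.
features = ["Mood", "Effort", "Score"]
n_labels = df.groupby(features)["Output"].transform("nunique")
contradicting = df[n_labels > 1]
print(contradicting)

# One possible action (an assumption): remove every contradicting row.
df = df[n_labels <= 1].reset_index(drop=True)
print(n_missing, n_dup, len(contradicting))
print(df)
```

Note that binning can itself create repeated or contradicting rows (two different scores landing in the same bin), which is why the checks run after the discretization step, in the order the assignment gives.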

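A minimal sketch of `my_ID3()` for Problem 1, assuming every attribute is already categorical. The signature (a list of row dictionaries, a list of attribute names, and a target column name), the nested-dict tree format, and the helper names `entropy`, `information_gain`, and `predict` are illustrative choices, and the six sample rows are made up.

```python
import math
from collections import Counter

def entropy(rows, target):
    """Shannon entropy (in bits) of the target column over rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on attribute."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder

def my_ID3(rows, attributes, target="Output"):
    """Return a nested-dict decision tree; leaves are class labels."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:          # pure node: return the class as a leaf
        return labels[0]
    if not attributes:                 # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    # Split on the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = my_ID3(subset, remaining, target)
    return tree

def predict(tree, sample):
    """Walk the tree for one sample; returns None for an unseen value."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        tree = tree[attribute].get(sample[attribute])
    return tree

# Made-up, already discretized rows (not the real dataset).
cols = ["Mood", "Effort", "Score", "Output"]
data = [
    ("Happy", "High", "High", "yes"),
    ("Happy", "Low", "High", "yes"),
    ("Sad", "High", "Low", "yes"),
    ("Sad", "Low", "Low", "no"),
    ("Neutral", "High", "High", "yes"),
    ("Neutral", "Low", "High", "no"),
]
rows = [dict(zip(cols, r)) for r in data]

tree = my_ID3(rows, ["Mood", "Effort", "Score"])
print(tree)
print(predict(tree, {"Mood": "Sad", "Effort": "Low", "Score": "Low"}))
```

On these rows Effort has the highest information gain at the root, and the Low branch is then split on Mood. A pandas DataFrame can be passed in as `df.to_dict("records")`.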

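For Problem 2, the four tasks could be sketched with pandas and sklearn as follows. The twelve rows are made up to match the column description; the real data would come from `pd.read_csv("lab01_dataset_2.csv")`. The integer codes for BP and Cholestrol, the 25% test size, and the `entropy` criterion are assumptions chosen for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up rows in the shape described in the question (not the real data).
df = pd.DataFrame({
    "Age": [23, 47, 47, 28, 61, 22, 49, 41, 60, 43, 34, 50],
    "Sex": ["F", "M", "M", "F", "F", "F", "F", "M", "M", "M", "F", "M"],
    "BP": ["High", "Low", "Low", "Normal", "Low", "Normal",
           "Normal", "Low", "Normal", "Low", "High", "Normal"],
    "Cholestrol": ["High"] * 9 + ["Normal"] * 3,
    "Na_to_K": [25.4, 13.1, 10.1, 7.8, 18.0, 8.6,
                16.3, 11.0, 15.2, 19.4, 12.0, 9.5],
    "Output": ["drugY", "drugC", "drugC", "drugX", "drugY", "drugX",
               "drugY", "drugC", "drugY", "drugY", "drugA", "drugX"],
})

# Task 1: integer encoding for the ordered columns, one-hot encoding for Sex.
df["BP"] = df["BP"].map({"Low": 0, "Normal": 1, "High": 2})
df["Cholestrol"] = df["Cholestrol"].map({"Normal": 0, "High": 1})
X = pd.get_dummies(df.drop(columns="Output"), columns=["Sex"], dtype=int)
y = df["Output"]  # sklearn classifiers accept string class labels
print(X)

# Task 2: train-test split, then fit the decision tree on the training part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_train, y_train)

# Task 3: predictions and true test labels side by side.
results = pd.DataFrame({"Actual": y_test.to_numpy(),
                        "Predicted": clf.predict(X_test)})
print(results)

# Task 4: textual representation of the learned tree.
tree_text = export_text(clf, feature_names=list(X.columns))
print(tree_text)
```

For a graphical representation instead of text, `sklearn.tree.plot_tree(clf)` together with matplotlib is an alternative.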