Question: Question 5. Since 2019, a novel coronavirus disease COVID-19 has subsequently more than 136,
Question 5. Since 2019, a novel coronavirus disease, COVID-19, has subsequently caused more than 136,… COVID-19 cases globally. Assume you would like to develop a classification model to predict COVID-19 infection based on symptom features. You have a dataset of patient records to use for training. Each record describes a patient by the attributes IsFever, IsTiredness, and IsCough. Each attribute takes one of two values, indicating True or False. The class label tells us whether a patient has a COVID-19 infection, with one value representing False and the other representing True. The following table summarizes the patient dataset and the two class labels.
IsFever   IsTiredness   IsCough   Number of Patients with Class   Number of Patients with Class
T         T             T
F         T             T
T         F             T
F         F             T
T         T             F
F         T             F
T         F             F
F         F             F
(a) Construct a decision tree classifier based on the records of labeled patients. Calculate the gain in the Gini index when splitting on the attributes IsFever and IsTiredness, respectively. Based on these Gini gains, which attribute would you choose as the first split in the decision tree induction?
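The Gini gain asked for in part (a) is the impurity of the parent node minus the size-weighted impurity of the two child nodes. The sketch below shows that arithmetic in Python. The patient counts in `ROWS` are hypothetical placeholders, not the figures from the question's table, and the names `gini` and `gini_gain` are chosen here for illustration only.

```python
# HYPOTHETICAL counts, not the question's data. Each row is
# (IsFever, IsTiredness, IsCough, n_negative, n_positive).
ROWS = [
    (True,  True,  True,  1, 9),
    (False, True,  True,  2, 6),
    (True,  False, True,  3, 7),
    (False, False, True,  6, 2),
    (True,  True,  False, 2, 8),
    (False, True,  False, 7, 3),
    (True,  False, False, 4, 4),
    (False, False, False, 9, 1),
]
ATTRS = {"IsFever": 0, "IsTiredness": 1, "IsCough": 2}


def gini(neg, pos):
    """Gini index of a node holding `neg` negative and `pos` positive records."""
    total = neg + pos
    if total == 0:
        return 0.0
    p = pos / total
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def class_counts(rows):
    """Total (negative, positive) patients over a list of rows."""
    return sum(r[3] for r in rows), sum(r[4] for r in rows)


def gini_gain(rows, attr):
    """Parent Gini minus the size-weighted Gini of the True/False children."""
    idx = ATTRS[attr]
    n = sum(class_counts(rows))
    weighted = 0.0
    for value in (True, False):
        subset = [r for r in rows if r[idx] == value]
        neg, pos = class_counts(subset)
        weighted += (neg + pos) / n * gini(neg, pos)
    return gini(*class_counts(rows)) - weighted


if __name__ == "__main__":
    for name in ("IsFever", "IsTiredness"):
        print(name, round(gini_gain(ROWS, name), 4))
```

The attribute with the larger gain is the one to split on first. Replacing the placeholder counts with the real table values gives the numbers the question asks for.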
(b) Write out the complete decision tree produced by the CART algorithm, using the Gini index as the impurity measure.
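For part (b), a CART-style tree can be grown by repeatedly choosing the binary split with the largest Gini gain until a node is pure or no split improves impurity. The following is a minimal sketch under stated assumptions: the counts in `ROWS` are hypothetical (the real ones are missing above), there is no pruning, ties at a leaf are labeled negative, and `build_tree` with its nested-dict output is an illustrative design rather than a reference implementation.

```python
# HYPOTHETICAL counts, not the question's data. Each row is
# (IsFever, IsTiredness, IsCough, n_negative, n_positive).
ROWS = [
    (True,  True,  True,  1, 9),
    (False, True,  True,  2, 6),
    (True,  False, True,  3, 7),
    (False, False, True,  6, 2),
    (True,  True,  False, 2, 8),
    (False, True,  False, 7, 3),
    (True,  False, False, 4, 4),
    (False, False, False, 9, 1),
]
ATTRS = {"IsFever": 0, "IsTiredness": 1, "IsCough": 2}


def counts(rows):
    """Total (negative, positive) patients over a list of rows."""
    return sum(r[3] for r in rows), sum(r[4] for r in rows)


def gini(neg, pos):
    """Gini index for a two-class node, written as 2p(1-p)."""
    total = neg + pos
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def split(rows, attr):
    """Partition rows into (attribute is True, attribute is False)."""
    idx = ATTRS[attr]
    return [r for r in rows if r[idx]], [r for r in rows if not r[idx]]


def gain(rows, attr):
    """Gini gain of splitting `rows` on `attr`."""
    n = sum(counts(rows))
    weighted = sum(
        sum(counts(part)) / n * gini(*counts(part)) for part in split(rows, attr)
    )
    return gini(*counts(rows)) - weighted


def build_tree(rows, min_gain=1e-12):
    """Grow the tree greedily; stop when pure or when no split helps."""
    neg, pos = counts(rows)
    best = max(ATTRS, key=lambda a: gain(rows, a))
    if neg == 0 or pos == 0 or gain(rows, best) <= min_gain:
        # Leaf: predict the majority class (a tie falls to negative here).
        return {"label": pos > neg, "neg": neg, "pos": pos}
    yes, no = split(rows, best)
    return {
        "attr": best,
        "yes": build_tree(yes, min_gain),
        "no": build_tree(no, min_gain),
    }


if __name__ == "__main__":
    import pprint
    pprint.pprint(build_tree(ROWS))
```

Because every attribute is binary, re-splitting on an attribute already used higher up yields zero gain, so the recursion stops on its own without tracking which attributes remain.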
(c) Calculate how many training instances are misclassified by the resulting decision tree.
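For part (c), each leaf predicts its majority class, so the training records it gets wrong are the ones in the minority class at that leaf. A short sketch of that count follows. The leaf counts in `LEAVES` are hypothetical, since the real tree depends on the table values that are missing above.

```python
# HYPOTHETICAL leaves of some finished tree, as (n_negative, n_positive).
LEAVES = [(10, 28), (12, 3), (12, 9)]


def misclassified(leaves):
    """A leaf predicts its majority class, so it errs on the minority count."""
    return sum(min(neg, pos) for neg, pos in leaves)


def error_rate(leaves):
    """Misclassified training records divided by all training records."""
    total = sum(neg + pos for neg, pos in leaves)
    return misclassified(leaves) / total


if __name__ == "__main__":
    print(misclassified(LEAVES), "of", sum(n + p for n, p in LEAVES))
```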
(d) Predict the class labels of the three new patients X, Y, and Z below, according to your constructed decision tree classifier.
Patient   IsFever   IsTiredness   IsCough   IsCOVID
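Prediction in part (d) amounts to walking the tree from the root, following the branch that matches each attribute value until a leaf is reached. The sketch below illustrates this with a hand-written tree. Both the tree shape and the symptom values for X, Y, and Z are hypothetical, because the real ones are not recoverable from the text above.

```python
# HYPOTHETICAL tree: internal nodes name an attribute, leaves carry a label.
TREE = {
    "attr": "IsFever",
    "yes": {"label": True},
    "no": {
        "attr": "IsCough",
        "yes": {"label": True},
        "no": {"label": False},
    },
}

# HYPOTHETICAL symptom values for the three new patients.
PATIENTS = {
    "X": {"IsFever": True,  "IsTiredness": False, "IsCough": False},
    "Y": {"IsFever": False, "IsTiredness": True,  "IsCough": True},
    "Z": {"IsFever": False, "IsTiredness": True,  "IsCough": False},
}


def predict(tree, patient):
    """Follow the yes/no branches until a leaf, then return its label."""
    node = tree
    while "label" not in node:
        node = node["yes"] if patient[node["attr"]] else node["no"]
    return node["label"]


if __name__ == "__main__":
    for name, symptoms in PATIENTS.items():
        print(name, predict(TREE, symptoms))
```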
