Question:

Consider a binary dataset with 400 examples, where half of them belong to class A and the other half belong to class B.

Next consider two decision stumps (i.e., trees of depth 1), T1 and T2, each with two children. For T1, the left child contains 150 examples of class A and 50 examples of class B; for T2, the left child contains 0 examples of class A and 100 examples of class B. (You should infer the contents of each right child.)

2.1 For each leaf of T1 and T2, compute the corresponding classification error, entropy (base e), and Gini impurity, rounding your answers to 2 decimal places. (Note: the value/prediction of each leaf is the majority class among all examples that belong to that leaf.)
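
For reference, here is a minimal Python sketch of how these per-leaf quantities could be computed from the counts above. The right-child counts are inferred by subtraction from the 200/200 class totals, and the helper name leaf_metrics is illustrative rather than part of the problem.

```python
import math

def leaf_metrics(n_a, n_b):
    """Classification error, entropy (base e), and Gini impurity for a leaf
    holding n_a class-A examples and n_b class-B examples."""
    n = n_a + n_b
    p = [n_a / n, n_b / n]                                 # class proportions in the leaf
    error = 1 - max(p)                                     # misclassification rate of the majority vote
    entropy = -sum(q * math.log(q) for q in p if q > 0)    # base-e entropy; 0*log(0) treated as 0
    gini = 1 - sum(q * q for q in p)                       # Gini impurity
    return error, entropy, gini

# Leaves from the problem statement; right children obtained by subtraction.
leaves = {
    "T1 left":  (150, 50),
    "T1 right": (50, 150),
    "T2 left":  (0, 100),
    "T2 right": (200, 100),
}
for name, (a, b) in leaves.items():
    err, ent, gini = leaf_metrics(a, b)
    print(f"{name}: error={err:.2f}, entropy={ent:.2f}, gini={gini:.2f}")
```

Running this sketch gives error 0.25, entropy 0.56, and Gini 0.38 for both leaves of T1; 0.00 for every metric on the pure left leaf of T2; and error 0.33, entropy 0.64, and Gini 0.44 for the right leaf of T2.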

2.2 Compare the quality of T1 and T2 (that is, the two different splits of the root) based on classification error, conditional entropy (base e), and weighted Gini impurity respectively, rounding your answers to 2 decimal places.
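
Continuing the sketch above, the split quality can be compared by weighting each leaf's metric by the fraction of the 400 examples that reach it. Again, this is only an illustrative sketch; the function names are not part of the original problem.

```python
import math

def leaf_metrics(n_a, n_b):
    """Per-leaf error, base-e entropy, and Gini (as in the sketch above)."""
    n = n_a + n_b
    p = [n_a / n, n_b / n]
    return (1 - max(p),
            -sum(q * math.log(q) for q in p if q > 0),
            1 - sum(q * q for q in p))

def split_quality(left, right):
    """Weight each leaf's metrics by the fraction of all examples reaching it."""
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    weighted = [[w * m for m in leaf_metrics(*counts)]
                for w, counts in ((n_left / n, left), (n_right / n, right))]
    return [l + r for l, r in zip(*weighted)]              # sum over the two leaves

for name, (left, right) in {"T1": ((150, 50), (50, 150)),
                            "T2": ((0, 100), (200, 100))}.items():
    err, cond_ent, gini = split_quality(left, right)
    print(f"{name}: error={err:.2f}, cond. entropy={cond_ent:.2f}, weighted Gini={gini:.2f}")
```

Under this weighting, both stumps have the same classification error (0.25), while T2 achieves lower conditional entropy (about 0.48 vs. 0.56) and lower weighted Gini impurity (about 0.33 vs. 0.38). Thus entropy and Gini prefer T2, even though classification error cannot distinguish the two splits.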
