2 Decision Tree (Theory) (Matthew) - 10 pts
In class, we primarily used information gain (the overall reduction in entropy) as the criterion for selecting
which feature to split on. Another method would be to use the Gini index. For multi-class, the Gini index
is calculated as follows, where $p_k$ represents the fraction of samples in class $k$ and $p$ represents the set of all probabilities $p_k$:
$$\text{Gini index}(p) = 1 - \sum_{k=1}^{K} p_k^2 .$$
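For example, for a single node containing the entire dataset described next (class counts 400, 500, and 300 out of $N = 1200$), the formula gives

$$1 - \left[\left(\tfrac{400}{1200}\right)^2 + \left(\tfrac{500}{1200}\right)^2 + \left(\tfrac{300}{1200}\right)^2\right] = 1 - \tfrac{50}{144} = \tfrac{94}{144} \approx 0.653.$$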
Consider a dataset comprising 1200 data points, with 400 data points from class C1, 500 data points
from class C2, and 300 data points from class C3. Suppose that decision tree model A splits the data into
three leaves. Assume that the label distribution is (300,50,50) at the first leaf, (50,400,50) at the second
leaf, and (50,50,200) at the third leaf, where (n1, n2, n3) denotes the number of points from C1, C2, and
C3, respectively. Similarly, suppose that decision tree model B splits the data into (300,0,100) at the first
leaf, (100,400,0) at the second leaf, and (0,100,200) at the third leaf. Answer the questions below,
showing all your work.
(a) Evaluate the misclassification rates for both decision trees and hence show that they are equal.
(b) Evaluate and compare the Weighted Gini index for the two trees. Which tree performs better in terms of
the Weighted Gini index? The Weighted Gini index is calculated as:
$$\text{Weighted Gini} = \sum_{j=1}^{J} \frac{n_j}{N}\, \text{Gini}(p_j)$$

where $\frac{n_j}{N}$ is the weight of node $j$, calculated as the proportion of the $n_j$ data points in node $j$ to the total number of data points $N$; $\text{Gini}(p_j)$ is the Gini index of leaf $j$; and $J$ is the total number of leaves in the tree.
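As a sanity check on the hand calculations, here is a minimal Python sketch that evaluates both criteria directly from the leaf counts given above (the function and variable names are illustrative, not part of the problem):

```python
def gini(counts):
    """Gini index of a leaf from its per-class sample counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def misclassification_rate(leaves):
    """Fraction of points not in their leaf's majority class."""
    total = sum(sum(leaf) for leaf in leaves)
    errors = sum(sum(leaf) - max(leaf) for leaf in leaves)
    return errors / total

def weighted_gini(leaves):
    """Leaf Gini indices weighted by leaf size n_j / N."""
    total = sum(sum(leaf) for leaf in leaves)
    return sum(sum(leaf) / total * gini(leaf) for leaf in leaves)

# Leaf label distributions (n1, n2, n3) from the problem statement.
tree_a = [(300, 50, 50), (50, 400, 50), (50, 50, 200)]
tree_b = [(300, 0, 100), (100, 400, 0), (0, 100, 200)]

for name, tree in [("A", tree_a), ("B", tree_b)]:
    print(f"Tree {name}: misclassification = {misclassification_rate(tree):.4f}, "
          f"weighted Gini = {weighted_gini(tree):.4f}")
```

Running this shows that the two trees tie on misclassification rate (part a) while their weighted Gini indices differ (part b), which is the contrast the question is driving at.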
