
Split Impurity Calculations

#   A   B   Class
1   T   F   C1
2   T   F   C1
3   T   F   C1
4   F   T   C1
5   F   F   C2
6   T   T   C2
7   F   T   C2
8   T   T   C2
9   T   T   C2

Q1. What is the Gini Index for the 9 data points before any split? Compute it from the class counts: 4 data points belong to C1 and 5 belong to C2.
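As a minimal sketch of the node-level calculation, the snippet below applies the usual definition Gini = 1 - sum(p_i^2) to the class counts given in the question. The function name `gini` is an illustrative choice, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def gini(counts):
    """Gini index of a node from its per-class counts: 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1 - sum(Fraction(c, total) ** 2 for c in counts)

# Class counts stated in the question: 4 points in C1, 5 points in C2.
root = gini([4, 5])
print(root, float(root))  # 40/81, about 0.494
```

A perfectly balanced two-class node would give 1/2, so the unsplit data is close to maximally impure.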

Q2. What is the Gini Index if the data points are split based on attribute A?

Remember to split the 9 data points into 2 nodes: one containing all data points with A=T, and the other containing all data points with A=F.

Then compute the Gini index for each of the two nodes.

Then combine the two Gini values using a weighted average to get the overall Gini Index for Split based on attribute A.

Re-watch the video if you are unsure how to compute the Gini Index for a split.
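The three hints for the split on A (partition into two nodes, compute each node's Gini, then take the weighted average) can be sketched as follows. The `ROWS` list is transcribed from the table in the question; the helper names `gini` and `gini_split_on_a` are hypothetical, chosen for this example.

```python
from collections import Counter
from fractions import Fraction

# (A, B, Class) for rows 1..9 of the table.
ROWS = [
    ("T", "F", "C1"), ("T", "F", "C1"), ("T", "F", "C1"),
    ("F", "T", "C1"), ("F", "F", "C2"), ("T", "T", "C2"),
    ("F", "T", "C2"), ("T", "T", "C2"), ("T", "T", "C2"),
]

def gini(labels):
    """Gini index of one node, given the class labels that fall in it."""
    total = len(labels)
    return 1 - sum(Fraction(n, total) ** 2 for n in Counter(labels).values())

def gini_split_on_a(rows):
    """Weighted average of the node Gini values after splitting on A."""
    nodes = {}
    for a, _b, label in rows:
        nodes.setdefault(a, []).append(label)  # one node per value of A
    return sum(
        Fraction(len(labels), len(rows)) * gini(labels)
        for labels in nodes.values()
    )

split_a = gini_split_on_a(ROWS)
print(split_a, float(split_a))  # 13/27, about 0.481
```

Each node is weighted by its share of the data, here 6/9 for A=T and 3/9 for A=F.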

Q3. What is the Gini Index if the data points are split based on attribute B?

Repeat what you did in Q2, using attribute B instead of A.

Q4. Which attribute gives a purer split, that is, the lower Gini Index for the split?
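One way to compare the two candidate splits is to parameterise the weighted-Gini calculation by column and pick the attribute with the lower value. This is a sketch under the assumption that "purer" means lower weighted Gini; `weighted_gini` and `COLUMNS` are illustrative names, not part of the assignment.

```python
from collections import Counter
from fractions import Fraction

# (A, B, Class) for rows 1..9 of the table.
ROWS = [
    ("T", "F", "C1"), ("T", "F", "C1"), ("T", "F", "C1"),
    ("F", "T", "C1"), ("F", "F", "C2"), ("T", "T", "C2"),
    ("F", "T", "C2"), ("T", "T", "C2"), ("T", "T", "C2"),
]
COLUMNS = {"A": 0, "B": 1}  # position of each attribute inside a row

def gini(labels):
    total = len(labels)
    return 1 - sum(Fraction(n, total) ** 2 for n in Counter(labels).values())

def weighted_gini(rows, column):
    """Gini index of the split on the attribute stored at index `column`."""
    nodes = {}
    for row in rows:
        nodes.setdefault(row[column], []).append(row[-1])  # class is last
    return sum(
        Fraction(len(labels), len(rows)) * gini(labels)
        for labels in nodes.values()
    )

scores = {name: weighted_gini(ROWS, idx) for name, idx in COLUMNS.items()}
best = min(scores, key=scores.get)  # lower weighted Gini means purer nodes
for name, score in scores.items():
    print(name, score, round(float(score), 4))
print("purer split:", best)
```

With this data the split on B scores 31/90 (about 0.344) against 13/27 (about 0.481) for A, so B produces the purer child nodes.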

Q5. Repeat Q1, Q2 and Q3 using Entropy instead of Gini.
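For the entropy variant, the same procedure applies with the node measure swapped for Shannon entropy, -sum(p_i * log2(p_i)). The sketch below assumes base-2 logarithms, which is the common convention but is not stated in the question; the function names are illustrative.

```python
import math
from collections import Counter

# (A, B, Class) for rows 1..9 of the table.
ROWS = [
    ("T", "F", "C1"), ("T", "F", "C1"), ("T", "F", "C1"),
    ("F", "T", "C1"), ("F", "F", "C2"), ("T", "T", "C2"),
    ("F", "T", "C2"), ("T", "T", "C2"), ("T", "T", "C2"),
]

def entropy(labels):
    """Shannon entropy (base 2) of the class labels in one node."""
    total = len(labels)
    # Counter only holds classes that occur, so log2(0) never arises.
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )

def weighted_entropy(rows, column):
    """Weighted average node entropy after splitting on one attribute."""
    nodes = {}
    for row in rows:
        nodes.setdefault(row[column], []).append(row[-1])
    return sum(
        len(labels) / len(rows) * entropy(labels) for labels in nodes.values()
    )

root_entropy = entropy([row[-1] for row in ROWS])
entropy_a = weighted_entropy(ROWS, 0)
entropy_b = weighted_entropy(ROWS, 1)
print(round(root_entropy, 4), round(entropy_a, 4), round(entropy_b, 4))
```

The values come out at roughly 0.991 before splitting, 0.973 for the split on A and 0.762 for the split on B, so entropy ranks the two attributes the same way Gini does.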
