Question: Split Impurity Calculations
| # | A | B | Class |
| 1 | T | F | C1 |
| 2 | T | F | C1 |
| 3 | T | F | C1 |
| 4 | F | T | C1 |
| 5 | F | F | C2 |
| 6 | T | T | C2 |
| 7 | F | T | C2 |
| 8 | T | T | C2 |
| 9 | T | T | C2 |
What is the Gini index of the 9 data points before any split? Compute it from the class counts: 4 data points belong to C1 and 5 belong to C2.
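As a minimal sketch of this calculation, assuming the usual definition of the Gini index (1 minus the sum of squared class proportions), the unsplit node can be evaluated as follows. The function name `gini` is illustrative and not part of the assignment.

```python
from collections import Counter

def gini(labels):
    """Gini index of one node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

# Class column of the table: 4 points in C1, 5 points in C2.
labels = ["C1"] * 4 + ["C2"] * 5

root_gini = gini(labels)
print(round(root_gini, 4))  # 0.4938, i.e. 1 - (4/9)^2 - (5/9)^2 = 40/81
```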
What is the Gini Index if the data points are split based on attribute A?
Remember to split the 9 data points into 2 nodes: one containing all data points with A=T and one containing all data points with A=F.
Then compute the Gini index of each of the two nodes.
Then combine the two Gini values into a weighted average, weighting each node by its share of the 9 data points, to get the overall Gini index for the split on attribute A.
Re-watch the video if you are unsure how to compute the Gini index for a split.
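The three steps for attribute A (partition, per-node Gini, weighted average) can be sketched as below. This assumes the table rows are encoded as `(A, B, Class)` tuples. `ROWS`, `gini`, and `gini_split` are hypothetical names chosen for illustration.

```python
from collections import Counter

# (A, B, Class) for data points 1..9, copied from the table.
ROWS = [
    ("T", "F", "C1"), ("T", "F", "C1"), ("T", "F", "C1"),
    ("F", "T", "C1"), ("F", "F", "C2"), ("T", "T", "C2"),
    ("F", "T", "C2"), ("T", "T", "C2"), ("T", "T", "C2"),
]

def gini(labels):
    """Gini index of one node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(rows, col):
    """Weighted average of the child-node Gini values for a split on column `col`."""
    total = 0.0
    for value in {row[col] for row in rows}:
        # Class labels of the rows that fall into this child node.
        child = [row[-1] for row in rows if row[col] == value]
        total += len(child) / len(rows) * gini(child)
    return total

# Column 0 is attribute A.
# A=T holds 6 points (3 C1, 3 C2); A=F holds 3 points (1 C1, 2 C2).
gini_a = gini_split(ROWS, 0)
print(round(gini_a, 4))  # 0.4815, i.e. (6/9)(1/2) + (3/9)(4/9) = 13/27
```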
What is the Gini Index if the data points are split based on attribute B?
Repeat the calculation from the previous question, using attribute B instead of A.
Which attribute gives a purer split?
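One way to compare the two candidate splits is to compute the weighted Gini index for each attribute and pick the lower value, since a lower value means purer child nodes. The sketch below assumes the table columns are transcribed as the strings `A` and `B` and the list `CLASS`. `weighted_gini` is an illustrative helper, not something given in the question.

```python
from collections import Counter, defaultdict

# Columns of the table for data points 1..9.
A = "TTTFFTFTT"
B = "FFFTFTTTT"
CLASS = ["C1"] * 4 + ["C2"] * 5

def weighted_gini(attr_values, labels):
    """Gini index of a split: child Gini values weighted by child size."""
    groups = defaultdict(list)
    for value, label in zip(attr_values, labels):
        groups[value].append(label)
    n = len(labels)
    total = 0.0
    for child in groups.values():
        child_gini = 1.0 - sum((c / len(child)) ** 2 for c in Counter(child).values())
        total += len(child) / n * child_gini
    return total

scores = {"A": weighted_gini(A, CLASS), "B": weighted_gini(B, CLASS)}
best = min(scores, key=scores.get)  # lower weighted Gini = purer split

for name, score in scores.items():
    print(name, round(score, 4))  # A 0.4815, then B 0.3444
print("purer split:", best)       # purer split: B
```

With these numbers, B=T holds 5 points (1 C1, 4 C2) and B=F holds 4 points (3 C1, 1 C2), which is why the split on B comes out purer.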
Repeat the three calculations above (no split, split on A, split on B) using entropy instead of the Gini index.
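The entropy version follows the same pattern, with the node impurity swapped for Shannon entropy. The sketch below assumes base-2 logarithms (entropy in bits), which is the common convention but is not stated in the question. `entropy` and `weighted_entropy` are illustrative names.

```python
from collections import Counter, defaultdict
from math import log2

# Columns of the table for data points 1..9.
A = "TTTFFTFTT"
B = "FFFTFTTTT"
CLASS = ["C1"] * 4 + ["C2"] * 5

def entropy(labels):
    """Shannon entropy of one node in bits (base-2 logarithm assumed)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def weighted_entropy(attr_values, labels):
    """Entropy of a split: child entropies weighted by child size."""
    groups = defaultdict(list)
    for value, label in zip(attr_values, labels):
        groups[value].append(label)
    return sum(len(child) / len(labels) * entropy(child) for child in groups.values())

root_entropy = entropy(CLASS)
entropy_a = weighted_entropy(A, CLASS)
entropy_b = weighted_entropy(B, CLASS)

print(round(root_entropy, 4))  # 0.9911
print(round(entropy_a, 4))     # 0.9728
print(round(entropy_b, 4))     # 0.7616
```

Under this assumption the entropy criterion agrees with the Gini criterion: the split on B has the lower weighted impurity.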
