Question: Consider the training data set collected by the ABC bank as shown in the following Table. The last column is the target value (label).
We would like to build a decision tree to classify new customers as "High" or "Low" credit customers. Consider the following two candidate splits at the root:
i. using the "Savings" attribute only.
ii. using the "Assets" attribute only.
(a) Draw a decision tree for each of the above candidate splits.
(b) Calculate the Gini index of each split. Based on your calculation of the Gini index, which split is better?
(c) Calculate the Entropy index of each split. Based on your calculation of the Entropy index, which split is better?
(Note: For simplicity, use log10 in your calculations.)
Customer Savings Assets 1 Medium 2 3 4 5 6 7 8 Low High Medium Low High Low High Low Credit High Low Medium Low Medium High Medium High High Low High High Low Medium Medium
Step by Step Solution
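(b) Class counts for the "Savings" split (columns give the Credit label):

Savings   Low   High   Total
Low         2      1       3
Medium      0      3       3
High        1      1       2
Root        3      5       8

Formula: Gini = 1 - p_1^2 - p_2^2, where p_1 and p_2 are the proportions of the two classes in a node.
Gini ...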
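As a rough check on part (b), and a possible starting point for part (c), the sketch below recomputes the impurity of the Savings split from the class counts given in the solution. The helper names (gini, entropy_log10, weighted_impurity) are illustrative and do not come from the original answer. Entropy uses base-10 logarithms, which is how the note in the question is read here, and 0 log 0 is taken as 0. The Assets split is left out because its counts do not appear in the visible part of the solution.

```python
import math

# Class counts per Savings branch as (Low credit, High credit),
# copied from the counts shown in the solution.
savings_counts = {
    "Low": (2, 1),
    "Medium": (0, 3),
    "High": (1, 1),
}


def gini(counts):
    """Gini index of one node: 1 minus the sum of squared class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy_log10(counts):
    """Entropy of one node with base-10 logs; empty classes contribute 0."""
    total = sum(counts)
    return -sum((c / total) * math.log10(c / total) for c in counts if c > 0)


def weighted_impurity(branches, impurity):
    """Impurity of a split: branch impurities weighted by branch size."""
    n = sum(sum(c) for c in branches.values())
    return sum(sum(c) / n * impurity(c) for c in branches.values())


# Root counts are the column sums over all branches.
root_counts = tuple(sum(col) for col in zip(*savings_counts.values()))

root_gini = gini(root_counts)
split_gini = weighted_impurity(savings_counts, gini)
root_entropy = entropy_log10(root_counts)
split_entropy = weighted_impurity(savings_counts, entropy_log10)

print(f"Gini:    root = {root_gini:.4f}, Savings split = {split_gini:.4f}")
print(f"Entropy: root = {root_entropy:.4f}, Savings split = {split_entropy:.4f}")
```

Applying the same functions to the Assets counts, once they are tabulated, would allow the two splits to be compared: under either measure, the split with the lower weighted impurity is the better one.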
