Question: Consider the training examples shown in Table 4.1 for a binary classification problem.
(a) Compute the Gini index for the overall collection of training examples.
(b) Compute the Gini index for the Customer ID attribute.
(c) Compute the Gini index for the Gender attribute.
(d) Compute the Gini index for the Car Type attribute using multiway split.
(e) Compute the Gini index for the Shirt Size attribute using multiway split.
(f) Which attribute is better, Gender, Car Type, or Shirt Size?
(g) Explain why Customer ID should not be used as the attribute test condition even though it has the lowest Gini.
Table 4.1. Data set for Exercise 2.

Customer ID  Gender  Car Type  Shirt Size   Class
1            M       Family    Small        C0
2            M       Sports    Medium       C0
3            M       Sports    Medium       C0
4            M       Sports    Large        C0
5            M       Sports    Extra Large  C0
6            M       Sports    Extra Large  C0
7            F       Sports    Small        C0
8            F       Sports    Small        C0
9            F       Sports    Medium       C0
10           F       Luxury    Large        C0
11           M       Family    Large        C1
12           M       Family    Extra Large  C1
13           M       Family    Medium       C1
14           M       Luxury    Extra Large  C1
15           F       Luxury    Small        C1
16           F       Luxury    Small        C1
17           F       Luxury    Medium       C1
18           F       Luxury    Medium       C1
19           F       Luxury    Medium       C1
20           F       Luxury    Large        C1
Step by Step Solution
(a) Gini = 1 - 2 × 0.5^2 = 0.5
(b) The Gini for each Customer ID value is 0. Therefore the overall Gini f...
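The arithmetic in parts (a) through (e) can be checked with a short script. The following is a minimal sketch, not the original solution: the helper names `gini` and `gini_split` are hypothetical, and `ROWS` is an assumed transcription of Table 4.1 that should be verified against the original table before relying on the numbers.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini index of a collection of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(values, labels):
    """Weighted Gini of a multiway split: one partition per attribute value."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(group) / n * gini(group) for group in groups.values())


# ASSUMPTION: rows transcribed from Table 4.1 as
# (gender, car type, shirt size, class); Customer ID is the row position.
ROWS = [
    ("M", "Family", "Small", "C0"),
    ("M", "Sports", "Medium", "C0"),
    ("M", "Sports", "Medium", "C0"),
    ("M", "Sports", "Large", "C0"),
    ("M", "Sports", "Extra Large", "C0"),
    ("M", "Sports", "Extra Large", "C0"),
    ("F", "Sports", "Small", "C0"),
    ("F", "Sports", "Small", "C0"),
    ("F", "Sports", "Medium", "C0"),
    ("F", "Luxury", "Large", "C0"),
    ("M", "Family", "Large", "C1"),
    ("M", "Family", "Extra Large", "C1"),
    ("M", "Family", "Medium", "C1"),
    ("M", "Luxury", "Extra Large", "C1"),
    ("F", "Luxury", "Small", "C1"),
    ("F", "Luxury", "Small", "C1"),
    ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Large", "C1"),
]

labels = [row[3] for row in ROWS]
customer_ids = list(range(1, len(ROWS) + 1))

results = {
    "Overall": gini(labels),
    "Customer ID": gini_split(customer_ids, labels),
    "Gender": gini_split([row[0] for row in ROWS], labels),
    "Car Type": gini_split([row[1] for row in ROWS], labels),
    "Shirt Size": gini_split([row[2] for row in ROWS], labels),
}

for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Under the assumed rows, the script reports 0.5 for the overall collection, 0 for Customer ID, 0.48 for Gender, 0.1625 for Car Type, and about 0.4914 for Shirt Size, so Car Type has the lowest Gini of the three attributes in part (f). Customer ID scores 0 only because every value is unique, which gives it no predictive power for new customers.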
