Question: Consider the dataset shown in Table 1 for a binary classification problem


Consider the dataset shown in Table 1 for a binary classification problem.

Table 1. Columns: Customer ID, Housing Type, Gender, Marital Status, Class. Row values as extracted (column alignment lost): 11 Married Single House Female Married Female Single Male Married Hostel Male Single - House Female Married Apartment Female Single 18 8 8 5 5 8 8 5 5 8 8 5 5 8 Apartment Male Married House Male Single Hostel Female Married Hostel Female Single House Male Married Hostel Male Single Hostel Female Married Apartment Female Single

d. [2.5 points] Compute the Gain Ratio for splitting over each of the four attributes. Which attribute provides the highest Gain Ratio?

e. [2 points] For splitting at the root node, would you choose the attribute that provides the maximum IG, or the attribute that provides the maximum Gain Ratio? Briefly explain your choice.

f. [3 points] Consider the following 3 decision trees:

[Figure: three decision trees (Tree 1, Tree 2, and a third). Surviving node and branch labels: Marital Status (Married, Single), Customer ID, Gender; Housing Type (Apartment, House, Hostel), Gender, Gender, Gender.]

Compute the difference between the entropy of the overall data and the weighted entropy of the leaves for each of the three trees. Based on these differences, which tree would you choose for performing classification? Is the attribute chosen at the root of this tree the same as the attribute chosen for splitting in part e? Briefly comment on the nature of your results and the properties of the impurity measure used while constructing decision trees.
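For readers working through parts d and e, here is a minimal sketch of how Information Gain and Gain Ratio are usually computed, assuming the standard definitions: IG is the parent entropy minus the weighted entropy of the children, and Gain Ratio is IG divided by the split information (the entropy of the attribute's own value distribution). The rows of Table 1 are not recoverable here, so `toy_rows` and its column values are hypothetical stand-ins, not the question's data, and the function names are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a list of discrete values."""
    total = len(values)
    return sum((n / total) * log2(total / n) for n in Counter(values).values())


def information_gain(rows, attr, target="Class"):
    """Parent entropy minus the size-weighted entropy of each child partition."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - weighted


def split_info(rows, attr):
    """Entropy of the attribute's value distribution (grows with the number of branches)."""
    return entropy([row[attr] for row in rows])


def gain_ratio(rows, attr, target="Class"):
    """Information gain normalised by split information; 0.0 if the attribute has one value."""
    si = split_info(rows, attr)
    return information_gain(rows, attr, target) / si if si > 0 else 0.0


# Hypothetical toy data (NOT Table 1): an ID-like attribute and a two-valued attribute.
toy_rows = [
    {"Customer ID": 1, "Gender": "Female", "Class": "C0"},
    {"Customer ID": 2, "Gender": "Female", "Class": "C0"},
    {"Customer ID": 3, "Gender": "Male", "Class": "C1"},
    {"Customer ID": 4, "Gender": "Male", "Class": "C1"},
]

for attr in ("Customer ID", "Gender"):
    print(attr, information_gain(toy_rows, attr), gain_ratio(toy_rows, attr))
```

On this toy data both attributes reach the same information gain, but the ID-like attribute has a larger split information because it creates one branch per row, so its Gain Ratio is lower. That is the usual argument for preferring Gain Ratio when an attribute with many distinct values is among the candidates. The quantity asked for in part f (overall entropy minus weighted leaf entropy) can be obtained the same way by grouping rows by the leaf they fall into.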
