Question:

c. A dataset S was collected and is presented in the figure below. Each data sample has two features, x1 and x2, and is classified as either cross or circle. You are asked to use Binary Decision Trees to build a binary classification model. (1) Based on the calculation of entropy-based information gain, at the root condition node, is it better to use x1 with the threshold of 10 or x2 with the threshold of 15? (2) Provide detailed calculations to support your answer. The formulas of Entropy and Information Gain are provided below.
Entropy: $H(S) = \sum_{i=1}^{C} -P_i \log_2(P_i)$, where dataset S contains C classes.
Information Gain: $IG(S, x) = H(S) - \sum_{t \in T} P(t)\, H(t)$, where $S = \bigcup_{t \in T} t$.
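
The two formulas above can be sketched in Python as follows. This is only an illustrative sketch: the function names `entropy` and `information_gain` are made up for this example, and the class counts at the bottom are hypothetical placeholders, not values read from the figure (which is not reproduced here). To answer the question, replace them with the cross/circle counts on each side of x1 = 10 and x2 = 15.

```python
from math import log2


def entropy(counts):
    """H(S) = sum_i -P_i * log2(P_i), from the per-class sample counts of a node."""
    total = sum(counts)
    # Classes with zero samples contribute nothing (the limit of p*log2(p) is 0).
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """IG(S, x) = H(S) - sum_t P(t) * H(t), where the subsets t partition S."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - weighted


# HYPOTHETICAL counts in the form [crosses, circles]; not taken from the figure.
parent = [8, 8]
split_on_x1 = [[7, 1], [1, 7]]  # assumed counts for x1 <= 10 and x1 > 10
split_on_x2 = [[5, 3], [3, 5]]  # assumed counts for x2 <= 15 and x2 > 15

ig_x1 = information_gain(parent, split_on_x1)
ig_x2 = information_gain(parent, split_on_x2)
better = "x1 at 10" if ig_x1 > ig_x2 else "x2 at 15"
print(f"IG(x1) = {ig_x1:.4f}, IG(x2) = {ig_x2:.4f}, better split: {better}")
```

The split with the larger information gain is the better choice for the root node, since it leaves the child nodes with lower weighted entropy.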
