Question:
The following table lists a dataset containing the details of six patients. Each patient is described in terms of three binary descriptive features (Obese, Smoker, and Drinks Alcohol) and a target feature (Cancer Risk).
ID | Obese | Smoker | Drinks Alcohol | Cancer Risk |
---|---|---|---|---|
1 | true | false | true | low |
2 | true | true | true | high |
3 | true | false | true | low |
4 | false | true | true | high |
5 | false | true | false | low |
6 | false | true | true | high |
a. Which of the descriptive features will the ID3 decision tree induction algorithm choose as the feature for the root node of the decision tree?
b. When designing a dataset, it is generally a bad idea if all of the descriptive features are indicators of the target feature taking a particular value. For example, a potential criticism of the design of the dataset in this question is that all the descriptive features are indicators of the Cancer Risk target feature taking the same level, high. Can you think of any descriptive features that could be added to the dataset that are indicators of the low target level?
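
For part (a), ID3 picks the root feature by comparing the information gain of each descriptive feature, that is, the entropy of the target over the whole dataset minus the weighted entropy that remains after splitting on that feature. The following is a minimal sketch of that calculation for the table above, assuming the standard entropy-based (base-2) information gain. The names `ROWS`, `entropy`, and `information_gain` are illustrative and do not come from the question.

```python
from collections import Counter
from math import log2

# The six patients from the table; the ID column is omitted because it is not a feature.
ROWS = [
    {"Obese": True,  "Smoker": False, "Drinks Alcohol": True,  "Cancer Risk": "low"},
    {"Obese": True,  "Smoker": True,  "Drinks Alcohol": True,  "Cancer Risk": "high"},
    {"Obese": True,  "Smoker": False, "Drinks Alcohol": True,  "Cancer Risk": "low"},
    {"Obese": False, "Smoker": True,  "Drinks Alcohol": True,  "Cancer Risk": "high"},
    {"Obese": False, "Smoker": True,  "Drinks Alcohol": False, "Cancer Risk": "low"},
    {"Obese": False, "Smoker": True,  "Drinks Alcohol": True,  "Cancer Risk": "high"},
]
FEATURES = ["Obese", "Smoker", "Drinks Alcohol"]
TARGET = "Cancer Risk"


def entropy(labels):
    """Shannon entropy (in bits) of a list of target labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target=TARGET):
    """Entropy of the target minus the weighted entropy after splitting on `feature`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


gains = {f: information_gain(ROWS, f) for f in FEATURES}
root = max(gains, key=gains.get)  # feature with the highest information gain

for feature, gain in gains.items():
    print(f"{feature}: {gain:.4f}")
print("Root feature:", root)
```

With three high and three low instances, the entropy of the full dataset is 1 bit. The information gains come out at roughly 0.0817 for Obese, 0.4591 for Smoker, and 0.1909 for Drinks Alcohol, so under this criterion ID3 would select Smoker for the root node.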