Data Mining Question
A dataset with 9 examples contains 3 features a1, a2, and a3, and a class variable. Two of the features are binary and one is continuous, while the class variable is binary. The dataset is described below:
| a1 | a2 | a3 | Class |
| --- | --- | --- | --- |
| T | T | 1.0 | + |
| T | T | 6.0 | + |
| T | F | 5.0 | - |
| F | F | 4.0 | + |
| F | T | 7.0 | - |
| F | T | 3.0 | - |
| F | F | 8.0 | - |
| T | F | 7.0 | + |
| F | T | 5.0 | - |
Answer each of the following questions with respect to the dataset. You must show your work and intermediate calculations. You must do this yourself and not use a data mining tool. Please circle your final answer for each part.
- What is the misclassification error rate associated with splitting on a1 and with splitting on a2, and which feature yields the best split?
- What are the Gini values for splitting on a1 and on a2, and which feature yields the best split (based on Gini)?
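The hand calculations for both parts can be cross-checked with a short script. The sketch below is not part of the original question and makes some assumptions: the "value" of a split is taken to be the average of the child-node impurities weighted by child size, the misclassification error of a node is 1 minus the fraction of its majority class, and the Gini of a node is 1 minus the sum of squared class fractions. The helper names (`partition`, `misclassification`, `gini`, `weighted_impurity`) are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Rows copied from the table above: (a1, a2, a3, class).
ROWS = [
    ("T", "T", 1.0, "+"),
    ("T", "T", 6.0, "+"),
    ("T", "F", 5.0, "-"),
    ("F", "F", 4.0, "+"),
    ("F", "T", 7.0, "-"),
    ("F", "T", 3.0, "-"),
    ("F", "F", 8.0, "-"),
    ("T", "F", 7.0, "+"),
    ("F", "T", 5.0, "-"),
]


def partition(rows, col):
    """Group the class labels by the value of the feature in column `col`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    return groups


def misclassification(labels):
    """Node error: 1 minus the fraction of the majority class."""
    counts = Counter(labels)
    return 1 - max(counts.values()) / len(labels)


def gini(labels):
    """Node Gini: 1 minus the sum of squared class fractions."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(rows, col, impurity):
    """Assumed split score: child impurities weighted by child size."""
    n = len(rows)
    return sum(len(g) / n * impurity(g) for g in partition(rows, col).values())


for name, col in (("a1", 0), ("a2", 1)):
    err = weighted_impurity(ROWS, col, misclassification)
    g = weighted_impurity(ROWS, col, gini)
    print(f"{name}: error = {err:.4f}, Gini = {g:.4f}")
```

Under these assumptions, splitting on a1 gives children with class counts (3+, 1-) and (1+, 4-), for an error of 2/9 and a weighted Gini of 3.1/9. Splitting on a2 gives (2+, 3-) and (2+, 2-), for an error of 4/9 and a weighted Gini of 4.4/9. Lower is better for both measures, so a1 comes out ahead on each.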
