Question: Data Mining
Consider the following approach for testing whether a classifier A beats another classifier B. Let $N$ be the size of a given dataset, $p_A$ be the accuracy of classifier A, $p_B$ be the accuracy of classifier B, and $p = (p_A + p_B)/2$ be the average accuracy for both classifiers. To test whether classifier A is significantly better than B, the following Z-statistic is used:
$$Z = \frac{p_A - p_B}{\sqrt{2p(1 - p)/N}}$$
Classifier A is assumed to be better than classifier B if Z>1.96.
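A minimal sketch of the statistic above in Python, in case it helps to check values by hand. The function name `z_statistic` is hypothetical, and the code assumes accuracies are passed as fractions in [0, 1], so the percentages in Table 3.8 would need to be divided by 100 first. Returning 0 when the denominator vanishes is also an assumption, not part of the exercise.

```python
import math


def z_statistic(acc_a, acc_b, n):
    """Z-statistic from the exercise for classifier A vs. classifier B.

    acc_a, acc_b: accuracies as fractions in [0, 1] (not percentages).
    n: number of records in the data set.
    """
    p = (acc_a + acc_b) / 2.0           # average accuracy of the two classifiers
    variance = 2.0 * p * (1.0 - p) / n  # term under the square root
    if variance == 0.0:
        # Assumption: p is 0 or 1 only when both accuracies are equal,
        # so the difference is 0 and we report Z = 0.
        return 0.0
    return (acc_a - acc_b) / math.sqrt(variance)


if __name__ == "__main__":
    # Anneal row of Table 3.8: decision tree 92.09%, naive Bayes 79.62%, N = 898.
    z = z_statistic(92.09 / 100.0, 79.62 / 100.0, 898)
    print("Z = %.2f, A better than B: %s" % (z, z > 1.96))
```

For the Anneal row this gives Z of about 7.6, well above 1.96, so the decision tree would count as significantly better than naive Bayes on that data set under this test.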
Table 3.8 compares the accuracies of three different classifiers, decision tree classifiers, naïve Bayes classifiers, and support vector machines, on various data sets. (The latter two classifiers are described in Chapter 4.)
Summarize the performance of the classifiers given in Table 3.8 using the following 3 × 3 table:
| win-loss-draw | Decision tree | Naïve Bayes | Support vector machine |
|---|---|---|---|
| Decision tree | 0 - 0 - 23 | | |
| Naïve Bayes | | 0 - 0 - 23 | |
| Support vector machine | | | 0 - 0 - 23 |
Table 3.8:
| Data Set | Size (N) | Decision tree (%) | Naïve Bayes (%) | Support vector machine (%) |
|---|---|---|---|---|
| Anneal | 898 | 92.09 | 79.62 | 87.19 |
| Australia | 690 | 85.51 | 76.81 | 84.78 |
| Auto | 205 | 81.95 | 58.05 | 70.73 |
| Breast | 699 | 95.14 | 95.99 | 96.42 |
| Cleve | 303 | 76.24 | 83.5 | 84.49 |
| Credit | 690 | 85.8 | 77.54 | 85.07 |
| Diabetes | 768 | 72.4 | 75.91 | 76.82 |
| German | 1000 | 70.9 | 74.7 | 74.4 |
| Glass | 214 | 67.29 | 48.59 | 59.81 |
| Heart | 270 | 80 | 84.07 | 83.7 |
| Hepatitis | 155 | 81.94 | 83.23 | 87.1 |
| Horse | 368 | 85.33 | 78.8 | 82.61 |
| Ionosphere | 351 | 89.17 | 82.34 | 88.89 |
| Iris | 150 | 94.67 | 95.33 | 96 |
| Labor | 57 | 78.95 | 94.74 | 92.98 |
| Led7 | 3200 | 73.34 | 73.16 | 73.56 |
| Lymphography | 148 | 77.03 | 83.11 | 86.49 |
| Pima | 768 | 74.35 | 76.04 | 76.95 |
| Sonar | 208 | 78.85 | 69.71 | 76.92 |
| Tic-tac-toe | 958 | 83.72 | 70.04 | 98.33 |
| Vehicle | 846 | 71.04 | 45.04 | 74.94 |
| Wine | 178 | 94.38 | 96.63 | 98.88 |
| Zoo | 101 | 93.07 | 93.07 | 96.04 |
Each cell in the table contains the number of wins, losses, and draws when comparing the classifier in a given row to the classifier in a given column.
Please be thorough in explanation/work to get to answer
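One way to fill in the off-diagonal cells is to apply the Z-test to every pair of classifiers on every data set and tally the outcomes. The sketch below does that in Python. All names in it (`CLASSIFIERS`, `TABLE_3_8`, `compare`, `win_loss_draw`) are hypothetical. It assumes that a "loss" for the row classifier means the column classifier wins the same test with the roles swapped, which is equivalent to Z < -1.96, and that everything else is a draw. The percentages from Table 3.8 are converted to fractions before testing.

```python
import math

CLASSIFIERS = ["Decision tree", "Naive Bayes", "Support vector machine"]

# (data set, N, accuracies in % in CLASSIFIERS order), copied from Table 3.8.
TABLE_3_8 = [
    ("Anneal", 898, (92.09, 79.62, 87.19)),
    ("Australia", 690, (85.51, 76.81, 84.78)),
    ("Auto", 205, (81.95, 58.05, 70.73)),
    ("Breast", 699, (95.14, 95.99, 96.42)),
    ("Cleve", 303, (76.24, 83.5, 84.49)),
    ("Credit", 690, (85.8, 77.54, 85.07)),
    ("Diabetes", 768, (72.4, 75.91, 76.82)),
    ("German", 1000, (70.9, 74.7, 74.4)),
    ("Glass", 214, (67.29, 48.59, 59.81)),
    ("Heart", 270, (80.0, 84.07, 83.7)),
    ("Hepatitis", 155, (81.94, 83.23, 87.1)),
    ("Horse", 368, (85.33, 78.8, 82.61)),
    ("Ionosphere", 351, (89.17, 82.34, 88.89)),
    ("Iris", 150, (94.67, 95.33, 96.0)),
    ("Labor", 57, (78.95, 94.74, 92.98)),
    ("Led7", 3200, (73.34, 73.16, 73.56)),
    ("Lymphography", 148, (77.03, 83.11, 86.49)),
    ("Pima", 768, (74.35, 76.04, 76.95)),
    ("Sonar", 208, (78.85, 69.71, 76.92)),
    ("Tic-tac-toe", 958, (83.72, 70.04, 98.33)),
    ("Vehicle", 846, (71.04, 45.04, 74.94)),
    ("Wine", 178, (94.38, 96.63, 98.88)),
    ("Zoo", 101, (93.07, 93.07, 96.04)),
]


def compare(acc_a, acc_b, n, threshold=1.96):
    """Return 'win', 'loss', or 'draw' for A vs. B; accuracies are fractions."""
    p = (acc_a + acc_b) / 2.0
    variance = 2.0 * p * (1.0 - p) / n
    if variance == 0.0:
        return "draw"  # assumption: identical accuracies of 0 or 1 are a draw
    z = (acc_a - acc_b) / math.sqrt(variance)
    if z > threshold:
        return "win"
    if z < -threshold:  # same test with A and B swapped
        return "loss"
    return "draw"


def win_loss_draw(rows):
    """Tally (wins, losses, draws) for each row classifier vs. each column classifier."""
    k = len(CLASSIFIERS)
    counts = [[{"win": 0, "loss": 0, "draw": 0} for _ in range(k)] for _ in range(k)]
    for _name, n, accs in rows:
        for a in range(k):
            for b in range(k):
                outcome = compare(accs[a] / 100.0, accs[b] / 100.0, n)
                counts[a][b][outcome] += 1
    return [[(c["win"], c["loss"], c["draw"]) for c in row] for row in counts]


if __name__ == "__main__":
    for name, row in zip(CLASSIFIERS, win_loss_draw(TABLE_3_8)):
        print(name, ["%d - %d - %d" % cell for cell in row])
```

Two sanity checks on the printed table: every cell should sum to 23, and the cell for (A, B) should be the cell for (B, A) with wins and losses swapped. The diagonal comes out as 0 - 0 - 23 because a classifier compared with itself always gives Z = 0.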
