Suppose we have a random dataset, i.e. the attribute values are generated independently of the class label, and it contains data points that belong to either the POSITIVE or the NEGATIVE class. We need to build a classifier for this dataset, using half of the data for training and the remaining half for testing. Answer the following questions and give a brief explanation for each answer:

(a) Suppose there are an equal number of positive and negative records in the data and the decision tree classifier predicts every test record to be positive. What is the expected error rate of the classifier on the test data?

(b) Repeat the previous analysis assuming that the classifier predicts each test record to be positive with probability 0.8 and negative with probability 0.2.

(c) Suppose two-thirds of the data belong to the positive class and the remaining one-third belong to the negative class. What is the expected error rate of a classifier that predicts every test record to be positive?

(d) Repeat the previous analysis assuming that the classifier predicts each test record to be positive with probability 2/3 and negative with probability 1/3.
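
One way to check the four parts is to compute the expected error directly. The sketch below is not taken from the original solution. It assumes the test half has the same class proportions as the full dataset, and that the classifier's prediction is independent of the true label (which is what random attributes imply). The names expected_error, p_pos and q_pos are illustrative only. Under those assumptions a record is misclassified when it is negative but predicted positive, or positive but predicted negative, so the error is q_pos * (1 - p_pos) + (1 - q_pos) * p_pos.

```python
from fractions import Fraction


def expected_error(p_pos, q_pos):
    """Expected error rate under the stated assumptions.

    p_pos: fraction of test records whose true class is positive.
    q_pos: probability that the classifier predicts positive,
           assumed independent of the true class.
    """
    p_pos, q_pos = Fraction(p_pos), Fraction(q_pos)
    # Two ways to be wrong: predict positive on a negative record,
    # or predict negative on a positive record.
    return q_pos * (1 - p_pos) + (1 - q_pos) * p_pos


# Parameters for parts (a) through (d) of the question.
cases = {
    "a": expected_error(Fraction(1, 2), 1),
    "b": expected_error(Fraction(1, 2), Fraction(4, 5)),
    "c": expected_error(Fraction(2, 3), 1),
    "d": expected_error(Fraction(2, 3), Fraction(2, 3)),
}

for part, err in cases.items():
    print(f"({part}) expected error = {err} (about {float(err):.3f})")
```

With these inputs the values come out to 1/2 for (a) and (b), 1/3 for (c), and 4/9 for (d). When the classes are balanced, the prediction probabilities do not change the error. In the imbalanced case, always predicting the majority class gives a lower error than guessing in proportion to the class frequencies.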
