Question:

Problem 4: Bayes Classifiers

In order to reduce my email load, I decided to implement a machine learning algorithm to decide whether or not I should read an email, or simply file it away instead. To train my model, I obtain the following data set of binary-valued features about each email, including whether I know the author or not, whether the email is long or short, and whether it has any of several keywords, along with my final decision about whether to read it (y = +1 for 'read', y = -1 for 'discard').

Columns: x1 = know author?, x2 = is long?, x3 = has 'research'?, x4 = has 'grade'?, x5 = has 'lottery'?, y = read?

Data values (row boundaries lost in extraction): 0 0 1 0 -1 1 1 0 0 -1 0 1 -1 1 1 0 -1 0 0 -1 0 +1 0 0 +1 1 0 0 0 0 +1 0 +1 -1

In the case of any ties, we will prefer to predict class +1.

I decide to try a Bayes classifier to make my decisions and compute my uncertainty.

(a) Compute all the probabilities necessary for a naive Bayes classifier, i.e., the class probability p(y) and all the individual feature probabilities p(x_i | y), for each class y and feature x_i. Which class would be predicted for x = (0 0 0 0 0)? What about for x = (1 1 0 1 0)?

(b) Compute the posterior probability that y = +1 given the observation x = (1 1 0 1 0).

(c) Why should we probably not use a Bayes classifier (using the joint probability of the features x, as opposed to a naive Bayes classifier) for these data?
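The quantities asked for in parts (a) and (b) can be computed mechanically once the table is available as a 0/1 matrix. The following is a minimal sketch, assuming empirical-frequency estimates with no smoothing and the tie rule stated in the problem (prefer +1). The function names and the small `X_toy` / `y_toy` arrays are hypothetical and for illustration only; they are not the problem's data set, so the printed numbers are not the answers to this problem.

```python
import numpy as np

def fit_naive_bayes(X, y):
    """Estimate p(y) and p(x_i = 1 | y) by empirical frequencies (no smoothing)."""
    X = np.asarray(X)
    y = np.asarray(y)
    classes = (+1, -1)
    prior = {c: np.mean(y == c) for c in classes}           # p(y = c)
    cond = {c: X[y == c].mean(axis=0) for c in classes}     # p(x_i = 1 | y = c)
    return prior, cond

def joint(x, c, prior, cond):
    """p(y = c) * prod_i p(x_i | y = c), using the naive Bayes factorization."""
    x = np.asarray(x)
    p1 = cond[c]
    per_feature = np.where(x == 1, p1, 1.0 - p1)            # p(x_i | y = c)
    return prior[c] * np.prod(per_feature)

def predict(x, prior, cond):
    """Pick the class with the larger joint score; ties go to +1."""
    return +1 if joint(x, +1, prior, cond) >= joint(x, -1, prior, cond) else -1

def posterior_pos(x, prior, cond):
    """p(y = +1 | x) by Bayes' rule (undefined if both joint scores are zero)."""
    s_pos = joint(x, +1, prior, cond)
    s_neg = joint(x, -1, prior, cond)
    return s_pos / (s_pos + s_neg)

# Hypothetical toy data: 6 emails, 3 binary features. NOT the problem's table.
X_toy = np.array([[1, 0, 1],
                  [1, 1, 0],
                  [0, 0, 1],
                  [0, 1, 1],
                  [0, 1, 0],
                  [1, 1, 0]])
y_toy = np.array([+1, +1, +1, -1, -1, -1])

prior, cond = fit_naive_bayes(X_toy, y_toy)
x_query = np.array([1, 1, 0])
print("p(y):", prior)
print("p(x_i = 1 | y):", cond)
print("prediction:", predict(x_query, prior, cond))
print("p(y = +1 | x):", posterior_pos(x_query, prior, cond))  # about 0.25 on the toy data
```

On part (c), one relevant count: naive Bayes estimates one probability per feature per class, whereas a full joint table over n binary features needs 2^n - 1 probabilities per class, which is 31 for the five features here.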
