Spam filters try to sort your e-mails, deciding which are real messages and which are unwanted. One method used is a point system. The filter reads each incoming e-mail and assigns points to the sender, the subject, key words in the message, and so on. The higher the point total, the more likely it is that the message is unwanted. The filter has a cutoff value for the point total; any message rated lower than that cutoff passes through to your inbox, and the rest, suspected to be spam, are diverted to the junk mailbox. We can think of the filter’s decision as a hypothesis test. The null hypothesis is that the e-mail is a real message and should go to your inbox. A higher point total provides evidence that the message may be spam; when there’s sufficient evidence, the filter rejects the null, classifying the message as junk. This usually works pretty well, but, of course, sometimes the filter makes a mistake.
a) When the filter allows spam to slip through into your inbox, which kind of error is that?
b) Which kind of error is it when a real message gets classified as junk?
c) Some filters allow the user (that’s you) to adjust the cutoff. Suppose your filter has a default cutoff of 50 points, but you reset it to 60. Is that analogous to choosing a higher or lower value of α for a hypothesis test? Explain.
d) What impact does this change in the cutoff value have on the chance of each type of error?
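
The decision rule described in the problem can be tried out with a small Python sketch. This is only an illustration under stated assumptions: the `Email` class, the function names, and every point total in `SAMPLE` are invented here and do not come from the exercise or from any real spam filter. The one rule taken from the text is that a message rated lower than the cutoff goes to the inbox and the rest go to junk.

```python
from dataclasses import dataclass


@dataclass
class Email:
    points: int    # point total assigned by the filter
    is_spam: bool  # ground truth, known only in this toy example


def is_flagged(points: int, cutoff: int) -> bool:
    """Reject the null ("real message") unless the total is below the cutoff."""
    return points >= cutoff


def error_counts(emails, cutoff):
    """Return (real messages sent to junk, spam delivered to the inbox)."""
    real_junked = sum(
        1 for e in emails if not e.is_spam and is_flagged(e.points, cutoff)
    )
    spam_delivered = sum(
        1 for e in emails if e.is_spam and not is_flagged(e.points, cutoff)
    )
    return real_junked, spam_delivered


# Hypothetical sample, invented for illustration only.
SAMPLE = [
    Email(20, False), Email(35, False), Email(48, False), Email(55, False),
    Email(52, True), Email(58, True), Email(70, True), Email(90, True),
]

if __name__ == "__main__":
    # Compare the default cutoff from part c) with the reset value.
    for cutoff in (50, 60):
        junked, delivered = error_counts(SAMPLE, cutoff)
        print(f"cutoff={cutoff}: real junked={junked}, spam delivered={delivered}")
```

Running it with both cutoffs shows how the two kinds of mistakes trade off on this made-up sample; deciding which count corresponds to which type of error is left to parts a) and b).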

  • Created May 15, 2015