Question 4 (25%)

A. Table 3 represents email data required to implement a statistical learning application for detecting spam emails.

Table 3: Training data set for a spam email detection application

Email ID
Keyword 1   YES  YES  NO   YES
Keyword 2   YES  NO   YES  NO   YES
Keyword 3   YES  YES  YES  NO   NO
Spam?       YES  YES  YES  NO   NO

Using these data and appropriate mathematical notation, such as P(Spam = YES) or P(Keyword 1 = YES | Spam = YES), calculate the following probabilities:

1) The probability that a particular email message is spam (1 Mark)
2) The probability that a particular email message is non-spam (1 Mark)
3) The probability of Keyword 1 occurring in a spam email (1 Mark)
4) The probability of Keyword 2 occurring in a spam email (1 Mark)
5) The probability of Keyword 3 occurring in a spam email (1 Mark)
6) The probability of Keyword 1 occurring in a non-spam email (1 Mark)
7) The probability of Keyword 2 occurring in a non-spam email (1 Mark)
8) The probability of Keyword 3 occurring in a non-spam email (1 Mark)
9) The probability of Keyword 3 NOT occurring in a spam email (1 Mark)
10) The probability of Keyword 3 NOT occurring in a non-spam email (1 Mark)

B. A new email is received, which can be described by the instance Email(YES, YES, NO), where Keyword 1 and Keyword 2 are present in this email instance (refer to Table 3 above). Assuming that Keyword 1 is conditionally independent of Keyword 2, and using the Naive Bayes rule to classify the new email as either Spam = YES or Spam = NO:

1) Show the mathematical expression of the classification rule with the mathematical notation given in Question A above. The meaning of each component in this expression should be briefly explained. To complete this task, refer to the email classes as C_i, with i = 1, 2, where C_1 describes the spam email class (Spam = YES) and C_2 describes the non-spam email class (Spam = NO). Use symbols A_j (where j = 1, 2, 3) to refer to the attributes, i.e. A_1 represents Keyword 1, A_2 represents Keyword 2, and A_3 represents Keyword 3. (3 Marks)
2) Show the values calculated by the Bayes classification rule. (10 Marks)
3) Show the classification decision made by the Bayes classification rule for the email received. Justify your answer in terms of the values generated by the Bayes rule. (2 Marks)
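
The counting behind Part A and the product form of the Naive Bayes score in Part B can be sketched in Python. This is only an illustrative sketch under stated assumptions, not a worked solution to the question: it assumes the Keyword 2, Keyword 3 and Spam? values of Table 3 are listed in email order, and it leaves Keyword 1 out entirely because only four of its values are recoverable from the table as given. The names ROWS, prior, conditional and score are hypothetical helpers introduced here for illustration.

```python
from fractions import Fraction

# Recoverable columns of Table 3 only, ASSUMED to be listed in email order.
# Keyword 1 is deliberately omitted: its column is incomplete in the source.
ROWS = [
    {"Keyword 2": "YES", "Keyword 3": "YES", "Spam": "YES"},
    {"Keyword 2": "NO",  "Keyword 3": "YES", "Spam": "YES"},
    {"Keyword 2": "YES", "Keyword 3": "YES", "Spam": "YES"},
    {"Keyword 2": "NO",  "Keyword 3": "NO",  "Spam": "NO"},
    {"Keyword 2": "YES", "Keyword 3": "NO",  "Spam": "NO"},
]


def prior(label):
    """P(Spam = label): fraction of all rows carrying that class label."""
    return Fraction(sum(r["Spam"] == label for r in ROWS), len(ROWS))


def conditional(attr, value, label):
    """P(attr = value | Spam = label): relative frequency within one class."""
    in_class = [r for r in ROWS if r["Spam"] == label]
    return Fraction(sum(r[attr] == value for r in in_class), len(in_class))


def score(instance, label):
    """Unnormalised Naive Bayes score: prior times the product of conditionals."""
    result = prior(label)
    for attr, value in instance.items():
        result *= conditional(attr, value, label)
    return result


# Partial instance: Keyword 2 present, Keyword 3 absent (Keyword 1 not scored).
partial_email = {"Keyword 2": "YES", "Keyword 3": "NO"}

print(prior("YES"))                            # 3/5
print(conditional("Keyword 2", "YES", "YES"))  # 2/3
print(score(partial_email, "YES"))             # 0
print(score(partial_email, "NO"))              # 1/5
```

Because Keyword 1 is not included, the two scores above are partial products and do not by themselves settle the classification asked for in Part B. The full rule multiplies each score by the corresponding P(Keyword 1 = YES | C_i) before comparing the classes.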
