1. Conceptual question (30 points).

(a) (15 points) Consider mutual-information-based feature selection. Suppose we have the following tables (the entries in the tables indicate counts) for spam versus non-spam emails: Given the two tables above, calculate the mutual information for the two keywords "prize" and "hello", respectively. Which keyword is more informative for deciding whether or not an email is spam?

(b) (15 points) Given two distributions, f0 = N(0, 1) and f1 = N(3, 1) (meaning that we are interested in detecting a mean shift of minimum size 3), derive the CUSUM statistic (i.e., write down the CUSUM detection statistic). Plot the CUSUM statistic for a sequence of randomly generated samples, where x_1, ..., x_100 are i.i.d. (independent and identically distributed) according to f0 and x_101, ..., x_200 are i.i.d. according to f1.
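One possible way to approach both parts is sketched below; it is an illustration under stated assumptions, not a verified solution. For part (a), the mutual information of a keyword indicator U and the class label C can be computed from a 2x2 count table as I(U; C) = sum over (u, c) of (N_uc / N) * log2(N * N_uc / (N_u. * N_.c)), with empty cells contributing zero. For part (b), the log-likelihood ratio of N(3, 1) against N(0, 1) is log(f1(x) / f0(x)) = 3x - 9/2, so the usual CUSUM recursion is W_t = max(W_{t-1} + 3 x_t - 9/2, 0) with W_0 = 0, and a change is declared once W_t exceeds a chosen threshold b. The function names, the random seed, and the use of log base 2 are assumptions made for this sketch, and the count tables must be supplied by the reader because they are not reproduced here.

```python
import math
import random


def mutual_information(table):
    """Mutual information (in bits) from a 2x2 table of counts.

    table[i][j] is the number of emails with keyword indicator i
    (e.g. present / absent) and class j (e.g. spam / non-spam).
    """
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            if n > 0:  # an empty cell contributes 0 (0 * log 0 := 0)
                mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


def cusum(samples, mu0=0.0, mu1=3.0, sigma=1.0):
    """CUSUM path W_t = max(W_{t-1} + log(f1(x_t) / f0(x_t)), 0), W_0 = 0.

    For two Gaussians with common variance the log-likelihood ratio is
    (mu1 - mu0) / sigma^2 * (x - (mu0 + mu1) / 2), i.e. 3x - 4.5 here.
    """
    w, path = 0.0, []
    for x in samples:
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2)
        w = max(w + llr, 0.0)
        path.append(w)
    return path


def simulate(seed=0):
    """100 samples from N(0, 1) followed by 100 samples from N(3, 1)."""
    rng = random.Random(seed)  # the seed is an arbitrary choice for reproducibility
    pre = [rng.gauss(0.0, 1.0) for _ in range(100)]
    post = [rng.gauss(3.0, 1.0) for _ in range(100)]
    return pre + post


def plot_cusum(path, change_point=100):
    """Plot the CUSUM path; assumes matplotlib is installed."""
    import matplotlib.pyplot as plt

    plt.plot(range(1, len(path) + 1), path)
    plt.axvline(change_point, linestyle="--")  # true change point
    plt.xlabel("t")
    plt.ylabel("W_t")
    plt.show()
```

To use it, call mutual_information once per keyword with that keyword's count table and compare the two values (the larger one marks the more informative keyword), and call plot_cusum(cusum(simulate())) to draw the statistic. Under these assumptions the path should hover near zero for the first 100 samples and then climb roughly linearly, since the post-change drift of the increments is 3 * 3 - 4.5 = 4.5 per sample.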
