Question:

1. (15 points) Consider mutual information based feature selection. Suppose we have the following tables (the entries indicate counts) for spam versus non-spam emails:

              "prize" = 1    "prize" = 0
"spam" = 1        150             10
"spam" = 0       1000          15000

              "hello" = 1    "hello" = 0
"spam" = 1        155              5
"spam" = 0      14000           2000

Given the two tables above, calculate the mutual information for the two keywords, "prize" and "hello", respectively. Which keyword is more informative for deciding whether or not an email is spam?
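The calculation the question asks for can be sketched in a few lines of Python. This is a minimal sketch, not a verified expert solution. It assumes the tables read as rows "spam" = 1 / "spam" = 0 and columns keyword = 1 / keyword = 0, with counts 150, 10, 1000, 15000 for "prize" and 155, 5, 14000, 2000 for "hello". It also assumes base-2 logarithms (bits) and the usual definition I(X; Y) = sum over x, y of p(x, y) * log2( p(x, y) / (p(x) p(y)) ), with probabilities estimated as count / total. The function and variable names are illustrative only.

```python
import math


def mutual_information(table):
    """Mutual information (in bits) from a 2-D contingency table of counts.

    table[i][j] is the number of emails with class value i and feature value j.
    Probabilities are estimated as relative frequencies (an assumption).
    """
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]        # class marginals
    col_sums = [sum(col) for col in zip(*table)]  # feature marginals

    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count == 0:
                continue  # a zero cell contributes nothing (0 * log 0 := 0)
            p_xy = count / total
            p_x = row_sums[i] / total
            p_y = col_sums[j] / total
            mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Assumed reading of the tables:
# rows are spam = 1, spam = 0; columns are keyword = 1, keyword = 0.
prize_counts = [[150, 10], [1000, 15000]]
hello_counts = [[155, 5], [14000, 2000]]

mi_prize = mutual_information(prize_counts)
mi_hello = mutual_information(hello_counts)

print(f"I(spam; prize) = {mi_prize:.6f} bits")
print(f"I(spam; hello) = {mi_hello:.6f} bits")
print("More informative keyword:", "prize" if mi_prize > mi_hello else "hello")
```

Under this reading of the tables, "prize" should score well above "hello": "hello" appears in most emails of both classes, so observing it changes the spam probability very little, while "prize" appears in most spam but only a small fraction of non-spam.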
