Consider mutual-information-based feature selection. Suppose we have the following tables (entries are counts) for spam versus non-spam emails:

            "prize"=1   "prize"=0
"spam"=1        150          10
"spam"=0       1000       15000

            "hello"=1   "hello"=0
"spam"=1        155           5
"spam"=0      14000        1000

Given the two tables above, calculate the mutual information between each keyword ("prize" and "hello") and the spam label. Which keyword is more informative for deciding whether or not an email is spam?
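
One way to carry out the calculation is sketched below. It assumes the standard definition I(X;Y) = sum over x,y of p(x,y) * log2( p(x,y) / (p(x) p(y)) ), measured in bits, with probabilities estimated as counts divided by each table's own total (the two tables sum to different totals, 16160 and 15160). The function name mutual_information and the variable names are illustrative and not part of the original question.

```python
from math import log2

def mutual_information(table):
    """Mutual information (in bits) between the row and column
    variables of a contingency table given as a list of count rows."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count == 0:
                continue  # a zero cell contributes nothing (0 * log 0 := 0)
            p_xy = count / total
            # p(x,y) / (p(x) p(y)) simplifies to count * total / (row * col)
            mi += p_xy * log2(count * total / (row_totals[i] * col_totals[j]))
    return mi

# Rows: spam=1, spam=0. Columns: keyword=1, keyword=0 (counts from the tables).
prize = [[150, 10], [1000, 15000]]
hello = [[155, 5], [14000, 1000]]

mi_prize = mutual_information(prize)
mi_hello = mutual_information(hello)
print(f"I(spam; prize) = {mi_prize:.5f} bits")
print(f"I(spam; hello) = {mi_hello:.5f} bits")
```

Under these assumptions the sketch gives roughly 0.03 bits for "prize" and well under 0.001 bits for "hello", which would make "prize" the more informative keyword. Intuitively, "hello" appears in most emails of both classes, so observing it changes the spam probability very little.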
