Question: Implement a simple Bayesian spam filter

Implement a simple Bayesian spam filter and determine whether each email in the 'test' folder is spam or ham.
- In the 'train' folder, two files contain 100 spam and 100 non-spam messages.
- In the 'test' folder, two files contain 20 spam and 20 non-spam messages.
- We want to classify 40 emails in the 'test' folder based on the emails in the 'train' folder.
- Thus, probabilities should be calculated from the emails in the 'train' folder.
- Please ignore all special characters (e.g. '~!@#$%^&*()->?/....).
- Use C or C++. No other programming language is allowed (e.g. Python, Java).
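The requirement to ignore special characters can be met at tokenization time. The following is a minimal sketch, not a prescribed solution: the function name `tokenize` is hypothetical, and lowercasing words and keeping digits are assumptions the assignment does not state.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split a message into lowercase words. In the default "C" locale, any
// character that is not an ASCII letter or digit acts as a separator, so
// characters such as ~!@#$%^&*()->?/ never become part of a word.
// Assumptions (not stated in the assignment): words are lowercased and
// digits are kept.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current.push_back(static_cast<char>(std::tolower(c)));
        } else if (!current.empty()) {
            // A separator ends the word collected so far.
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);  // flush the last word
    return words;
}
```

For example, `tokenize("Free $$$ NOW!!! Click->here")` yields the words `free`, `now`, `click`, `here`.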
[Procedure for decision]
For each email in the 'test' folder:
- calculate r(w1, ..., wn) and apply a threshold;
- assign a predicted label (spam or non-spam).
Calculate the accuracy of your prediction:
(the number of correctly classified test emails) divided by 40,
since we have 40 test emails (20 spam and 20 non-spam).
For fun, let's try various thresholds (T) for the decision:
T = 0.6, 0.7, 0.8, 0.9, 0.95.
This means that we will have 5 accuracies.
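The decision procedure above can be sketched as follows. The question does not define r(w1, ..., wn), so this sketch assumes the common textbook form r = Πp(wi) / (Πp(wi) + Πq(wi)), where p(w) and q(w) are the fractions of training spam and non-spam messages that contain w (equal priors, which matches the 100/100 training split). Add-one smoothing, skipping words never seen in training, and labeling an email spam when r > T are further assumptions. The names `Message`, `Model`, `train`, `spamScore`, and `accuracy` are hypothetical, and reading the files is left out because their layout is not specified.

```cpp
#include <cmath>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

using Message = std::set<std::string>;  // distinct words of one email

struct Model {
    // Number of training messages of each class that contain a given word.
    std::unordered_map<std::string, int> spamCount, hamCount;
    int spamDocs = 0, hamDocs = 0;
};

Model train(const std::vector<Message>& spam, const std::vector<Message>& ham) {
    Model m;
    m.spamDocs = static_cast<int>(spam.size());
    m.hamDocs = static_cast<int>(ham.size());
    for (const auto& msg : spam)
        for (const auto& w : msg) ++m.spamCount[w];
    for (const auto& msg : ham)
        for (const auto& w : msg) ++m.hamCount[w];
    return m;
}

// Assumed form: r = prod p(wi) / (prod p(wi) + prod q(wi)).
// Products are accumulated as sums of logs to avoid underflow.
double spamScore(const Model& m, const Message& msg) {
    double logP = 0.0, logQ = 0.0;
    for (const auto& w : msg) {
        auto s = m.spamCount.find(w);
        auto h = m.hamCount.find(w);
        int sc = (s == m.spamCount.end()) ? 0 : s->second;
        int hc = (h == m.hamCount.end()) ? 0 : h->second;
        if (sc == 0 && hc == 0) continue;  // word unseen in training: no evidence
        // Add-one smoothing (an assumption) keeps every probability nonzero.
        logP += std::log((sc + 1.0) / (m.spamDocs + 2.0));
        logQ += std::log((hc + 1.0) / (m.hamDocs + 2.0));
    }
    return 1.0 / (1.0 + std::exp(logQ - logP));  // equals P / (P + Q)
}

// Fraction of test emails labeled correctly when "spam" means r > T.
double accuracy(const Model& m, const std::vector<Message>& testSpam,
                const std::vector<Message>& testHam, double T) {
    int correct = 0;
    for (const auto& msg : testSpam)
        if (spamScore(m, msg) > T) ++correct;
    for (const auto& msg : testHam)
        if (spamScore(m, msg) <= T) ++correct;
    const double total = static_cast<double>(testSpam.size() + testHam.size());
    return total > 0.0 ? correct / total : 0.0;
}
```

Calling `accuracy` once for each T in {0.6, 0.7, 0.8, 0.9, 0.95} gives the five accuracies; with the 20 spam and 20 non-spam test emails, the denominator is 40. A higher T makes the filter more reluctant to label an email as spam, so fewer non-spam emails are misclassified at the cost of missing more spam.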