Question:

Peter Derby works as a cybersecurity analyst at a private equity firm. His colleagues at the firm have been inundated with spam e-mails. Peter has been asked to implement a spam detection system on the company's e-mail server. He reviewed a sample of 500 spam and legitimate e-mails and recorded the following variables: spam (1 if spam, 0 otherwise), the number of recipients, the number of hyperlinks, and the number of characters in the message. A portion of the Spam_Data worksheet is shown in the accompanying table.


a. Perform KNN analysis to estimate a classification model for spam detection using the Spam_Data worksheet and score new e-mails in the Spam_Score worksheet. What is the optimal value of k? 

b. Report the overall accuracy, specificity, sensitivity, and precision rates for the test data set (for Analytic Solver) or validation data set (for R). 

c. What is the area under the ROC curve (or AUC value)? 

d. What is the predicted outcome for the first new e-mail?
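
The workflow in parts a and d (fit KNN, pick k, score a new e-mail) can be sketched in plain Python. This is only an illustration under stated assumptions: the rows below are made-up values, not taken from Spam_Data; the function names are hypothetical; and choosing k by validation accuracy is one common rule that may differ from the defaults in Analytic Solver or R. Features are standardized first because the character count would otherwise dominate the distance.

```python
import math
from collections import Counter

def fit_scaler(X):
    """Column means and standard deviations, computed on the training rows only."""
    n = len(X)
    cols = list(zip(*X))
    means = [sum(col) / n for col in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(cols, means)]
    return means, stds

def scale(X, means, stds):
    """Standardize each row with the training means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training rows closest to x (Euclidean distance).
    An odd k avoids tied votes when there are two classes."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def best_k(train_X, train_y, valid_X, valid_y, candidates):
    """Candidate k with the highest validation accuracy; ties go to the smallest k."""
    def accuracy(k):
        hits = sum(knn_predict(train_X, train_y, x, k) == y
                   for x, y in zip(valid_X, valid_y))
        return hits / len(valid_y)
    return max(candidates, key=lambda k: (accuracy(k), -k))

# Hypothetical rows: [recipients, hyperlinks, characters]; 1 = spam, 0 = legitimate.
train_X = [[1, 0, 300], [2, 1, 450], [1, 1, 500],
           [12, 8, 90], [15, 6, 120], [10, 9, 60]]
train_y = [0, 0, 0, 1, 1, 1]
valid_X = [[2, 0, 400], [14, 7, 100]]
valid_y = [0, 1]

means, stds = fit_scaler(train_X)
train_Z = scale(train_X, means, stds)
valid_Z = scale(valid_X, means, stds)

k = best_k(train_Z, train_y, valid_Z, valid_y, candidates=[1, 3, 5])

# Score a (hypothetical) new e-mail with the selected k.
new_email = scale([[11, 5, 80]], means, stds)[0]
prediction = knn_predict(train_Z, train_y, new_email, k)
print("selected k:", k, "predicted class:", prediction)
```

With the real worksheet, the 500 rows would be partitioned into training and validation (and, for Analytic Solver, test) sets before this step, and the first row of Spam_Score would take the place of `new_email`.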


Step by Step Answer:
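
The rates in part b and the AUC in part c follow directly from the actual labels, the predicted labels, and the predicted spam probabilities. The sketch below shows the standard definitions, treating spam (1) as the positive class. The function names and the eight example e-mails are hypothetical, so the numbers it prints are not the answer for Spam_Data.

```python
def classification_rates(actual, predicted):
    """Accuracy, sensitivity, specificity, and precision with spam (1) as the positive class.
    Assumes each denominator is non-zero (at least one spam, one legitimate,
    and one predicted-spam e-mail)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)  # spam flagged as spam
    tn = sum(a == 0 and p == 0 for a, p in pairs)  # legitimate kept as legitimate
    fp = sum(a == 0 and p == 1 for a, p in pairs)  # legitimate flagged as spam
    fn = sum(a == 1 and p == 0 for a, p in pairs)  # spam missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

def auc(actual, scores):
    """Area under the ROC curve, computed as the share of (spam, legitimate) pairs
    in which the spam e-mail gets the higher score; tied pairs count as one half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation results for eight e-mails (cutoff of 0.5 on the score).
actual    = [1,   1,   1,   0,   0,   0,   0,   0]
scores    = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
predicted = [1 if s >= 0.5 else 0 for s in scores]

rates = classification_rates(actual, predicted)
area = auc(actual, scores)
print(rates, area)
```

For a KNN model the score of an e-mail is typically the fraction of its k nearest neighbors that are spam, so small values of k produce many tied scores; the half-credit rule above handles those ties.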

Related Book:

Business Analytics: Communicating with Numbers

ISBN: 9781260785005

1st Edition

Authors: Sanjiv Jaggia, Alison Kelly, Kevin Lertwachara, Leida Chen
