Evaluating Predictive Models
In addition to accuracy, precision and recall are two other metrics we can use in evaluating the performance of predictive classifiers. Precision is defined as the ratio TP / (TP + FP), and recall is defined as the ratio TP / (TP + FN), where TP is True Positives, FP is False Positives, TN is True Negatives, and FN is False Negatives.
For example, suppose the prediction problem is to identify customers who will purchase a product, and the model predictions are as follows:
Of the 10 customers the model predicted would buy the product, only 5 purchased it (5 True Positives), while the other 5 did not (5 False Positives). Precision = 5/10 = 0.5 (or 50%).
There were a total of 20 customers who purchased the product, of which the model correctly identified 5 (5 True Positives); the other 15 it wrongly predicted as not buying the product (15 False Negatives). Recall = 5/20 = 0.25 (or 25%).
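The arithmetic above can be sketched as two small helper functions (the function names are illustrative, not part of the question):

```python
def precision(tp, fp):
    # Of all the customers the model predicted would buy, how many actually did
    return tp / (tp + fp)

def recall(tp, fn):
    # Of all the customers who actually bought, how many the model caught
    return tp / (tp + fn)

# Purchase example: 5 true positives, 5 false positives, 15 false negatives
print(precision(5, 5))   # 0.5
print(recall(5, 15))     # 0.25
```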
Based on the business problem, either precision or recall may be the more important measure. For one of the two scenarios below, identify which measure is more important and present your reasoning.
Question 1
The problem is to predict the risk of diabetes. If a patient is identified as at risk of diabetes, further testing is done and lifestyle and diet changes are prescribed. Among 25 patients, the model predicts 5 patients to be at risk of diabetes, of which only 4 are truly at risk (TP) and one is not (FP). In total there are 10 patients who are at risk of diabetes, of which the model correctly identified 4 as at risk (TP) and wrongly predicted 6 as not at risk (False Negatives). What are the precision and recall, and which is more important in this case?
Question 2
The problem is to predict whether an email message is spam or not. If a message is labeled as spam, it is deleted; if not, it is sent to the Inbox. Out of 25 messages, the model predicts 5 messages as spam, of which only 4 are truly spam (TP) and one is not (FP). In total there are 10 messages that are truly spam, of which the model correctly identified 4 as spam (TP) and wrongly predicted 6 as not spam (False Negatives). What are the precision and recall, and which is more important in this case?
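Both questions use the same counts (4 TP, 1 FP, 6 FN), so the numeric part of each answer can be checked with a short sketch; deciding which metric matters more still depends on the business context (the function names here are illustrative):

```python
def precision(tp, fp):
    # Fraction of positive predictions that were correct
    return tp / (tp + fp)

def recall(tp, fn):
    # Fraction of actual positives the model identified
    return tp / (tp + fn)

# Shared counts for Question 1 and Question 2: 4 TP, 1 FP, 6 FN
print(precision(4, 1))  # 0.8
print(recall(4, 6))     # 0.4
```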
