Question:

1. A data mining routine has been applied to a transaction dataset and has classified 88 records as fraudulent (30 correctly so) and 952 as non-fraudulent (920 correctly so). Construct the confusion matrix and calculate the overall error rate, precision, and recall for fraudulent items.

2. A large number of insurance records are to be examined to develop a model for predicting fraudulent claims. Of the claims in the historical database, 1% were judged to be fraudulent. A sample is taken to develop a model, and oversampling is used to provide a balanced sample in light of the very low response rate. When applied to this sample (n = 800), the model ends up correctly classifying 310 frauds and 270 non-frauds. It missed 90 frauds and incorrectly classified 130 records as frauds when they were not.
   a. Produce the confusion matrix for the sample as it stands.
   b. Find the adjusted misclassification rate (adjusting for the oversampling).
   c. What percentage of new records would you expect to be classified as fraudulent?

3. Describe the training and validation partition. Describe 4-fold cross-validation, and calculate the 4-fold cross-validation accuracy, precision (of Label 1), and recall (of Label 0) for the following example.
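
The arithmetic for parts 1 and 2 can be sketched in Python. This is a minimal sketch, not an official solution. It assumes that "30 correctly so" means 30 true positives out of 88 predicted frauds, and that the oversampling adjustment in part 2 reweights the per-class error rates by the 1% fraud rate in the historical database. The names `ConfusionMatrix` and `adjusted_rates` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # actual fraud, predicted fraud
    fp: int  # actual non-fraud, predicted fraud
    fn: int  # actual fraud, predicted non-fraud
    tn: int  # actual non-fraud, predicted non-fraud

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def error_rate(self) -> float:
        return (self.fp + self.fn) / self.total

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)


def adjusted_rates(cm: ConfusionMatrix, prior: float) -> tuple[float, float]:
    """Reweight per-class rates from a balanced sample by the assumed
    population fraud rate `prior`. Returns (adjusted misclassification
    rate, expected share of new records flagged as fraud)."""
    miss_rate = cm.fn / (cm.tp + cm.fn)         # frauds predicted as non-fraud
    false_alarm_rate = cm.fp / (cm.fp + cm.tn)  # non-frauds predicted as fraud
    adj_error = prior * miss_rate + (1 - prior) * false_alarm_rate
    flagged = prior * (1 - miss_rate) + (1 - prior) * false_alarm_rate
    return adj_error, flagged


# Part 1: 88 predicted frauds (30 correct), 952 predicted non-frauds (920 correct).
q1 = ConfusionMatrix(tp=30, fp=88 - 30, fn=952 - 920, tn=920)

# Part 2: balanced sample of 800; assumed population fraud rate of 1%.
q2 = ConfusionMatrix(tp=310, fp=130, fn=90, tn=270)
q2_adj_error, q2_flagged = adjusted_rates(q2, prior=0.01)

print(f"Q1 error rate: {q1.error_rate():.4f}")
print(f"Q1 precision:  {q1.precision():.4f}")
print(f"Q1 recall:     {q1.recall():.4f}")
print(f"Q2 sample error rate:   {q2.error_rate():.4f}")
print(f"Q2 adjusted error rate: {q2_adj_error:.4f}")
print(f"Q2 share flagged fraud: {q2_flagged:.4f}")
```

Under these assumptions, part 1 gives a confusion matrix of 30 true positives, 58 false positives, 32 false negatives, and 920 true negatives, so the error rate is 90/1040, precision is 30/88, and recall is 30/62. In part 2 the unadjusted sample error is 220/800, while reweighting by the 1% fraud rate puts most of the weight on the non-fraud error rate of 130/400.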
