Question:

A large number of insurance records are to be examined to develop a model for predicting fraudulent claims. Of the claims in the historical database, 1% were judged to be fraudulent. A sample is taken to develop a model, and oversampling is used to provide a balanced sample in light of the very low response rate. When applied to this sample (n = 800), the model correctly classifies 310 frauds and 270 non-frauds. It misses 90 frauds and incorrectly classifies 130 non-fraudulent records as frauds.

a. Produce the confusion matrix for the sample as it stands.

b. Find the adjusted misclassification rate (adjusting for the oversampling).

c. What percentage of new records would you expect to be classified as fraudulent?
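
One way to set up the arithmetic for parts a to c is sketched below in Python. It is a sketch under stated assumptions, not a definitive solution: it assumes the model's per-class error rates in the balanced sample carry over to new records, and that the 1% fraud rate from the historical database also holds for new records. The variable names are illustrative and do not come from the question.

```python
# Counts from the balanced sample (n = 800), as given in the question.
tp = 310  # actual frauds classified as fraud
fn = 90   # actual frauds classified as non-fraud (missed)
tn = 270  # actual non-frauds classified as non-fraud
fp = 130  # actual non-frauds classified as fraud

# Assumption: new records have the same 1% fraud rate as the historical database.
prior_fraud = 0.01

actual_fraud = tp + fn      # 400 frauds in the sample
actual_nonfraud = tn + fp   # 400 non-frauds in the sample
n = actual_fraud + actual_nonfraud

# (a) Confusion matrix for the sample as it stands (rows: actual, columns: predicted).
confusion = {
    "actual fraud": {"pred fraud": tp, "pred non-fraud": fn},
    "actual non-fraud": {"pred fraud": fp, "pred non-fraud": tn},
}
sample_error = (fn + fp) / n

# Per-class rates, assumed to be unaffected by the oversampling.
miss_rate = fn / actual_fraud            # share of frauds that are missed
false_alarm_rate = fp / actual_nonfraud  # share of non-frauds flagged as fraud
hit_rate = tp / actual_fraud             # share of frauds that are caught

# (b) Reweight each class error rate by its share of the original population.
adjusted_error = prior_fraud * miss_rate + (1 - prior_fraud) * false_alarm_rate

# (c) Expected share of new records classified as fraudulent.
share_flagged = prior_fraud * hit_rate + (1 - prior_fraud) * false_alarm_rate

for actual, row in confusion.items():
    print(actual, row)
print(f"sample misclassification rate:   {sample_error:.2%}")
print(f"adjusted misclassification rate: {adjusted_error:.2%}")
print(f"expected share flagged as fraud: {share_flagged:.2%}")
```

Under these assumptions the sample misclassification rate is 27.5%, the adjusted rate is about 32.4%, and about 32.95% of new records would be classified as fraudulent. The adjusted figures are dominated by the false-alarm rate on non-frauds, because non-frauds make up 99% of the population.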
