Question:

The answers to this problem are based on R version 3.5.3. To replicate the results with newer versions of R, execute the following line of code at the beginning of the R session or your R code: suppressWarnings(RNGversion("3.5.3")). For R, partition the data set into 60% training and 40% validation. Use the statement set.seed(1) to specify the random seed of 1 for both data partitioning and cross-validation. If the predictor variable values are in character format, treat the predictor variable as a categorical variable. Otherwise, treat the predictor variable as a numerical variable.

Credit card fraud is becoming a serious problem for the financial industry and can pose a considerable cost to banks, credit card issuers, and consumers. Fraud detection using data mining techniques has become an indispensable tool for banks and credit card companies to combat fraudulent transactions.

The accompanying data file contains the following variables: Fraud (1 if fraudulent activities, 0 otherwise), Amount (1 if low, 2 if medium, 3 if high), Online (1 if online transactions, 0 otherwise), and Prior (1 if products that the card holder previously purchased, 0 otherwise).

a. Partition the data to develop a naïve Bayes classification model. Report the accuracy, sensitivity, and specificity rates for the validation data set. Note: Round your answers to 2 decimal places.
b. What is the lift value of the leftmost bar?
c. What is the area under the ROC curve (or AUC value)?
d. Which of the following statements is most accurate?
   A. By selecting the top 10% of the validation cases with the highest predicted probability of belonging to the target class, the naïve Bayes model would identify more target class cases than if the cases are randomly selected.
   B. Using 0.5 as the cutoff rate, the naïve Bayes model's accuracy rate is higher than that of the naïve rule (classifying all cases to the predominant class) for the validation data.
   C. The lift curve of the naïve Bayes model lies slightly above the lift curve of the baseline model.
   D. Overall, the naïve Bayes model performs better than the baseline model in terms of both sensitivity and specificity.
   E. All of the statements above are accurate.

Change the cutoff value to 0.1. Report the accuracy, sensitivity, specificity, and precision rates for the validation data set. Note: Round your answers to 2 decimal places.

Data (each record is four digits, in the order Fraud, Amount, Online, Prior):
0201 0300 0111 1200 0301 0311 0211 1100 0310 0101
0101 0100 1300 0101 0301 1201 0311 0310 0210 0111
0301 0210 1201 0101 0311 0201 0301 1300 1311 0300
1100 0201 0311 0201 0200 0110 0200 0211 0201 0301
0201 1200 0211 0201 0111 0100 0100 0300 1211 0301
0111 0300 0201 0210 1100 0110 0101 0101 0101 0200
0201 1200 0101 0211 1210 0101 0201 0200 0211 0101
0301 0310 0111 0100 0111 0300 0211 1311 0201 0101
0100 0200 0101 1310 0300 0310 0100 0300 0301 0201
0100 0300 0201 0301 1310 0201 1311 0201 0201 0301
0101 0301 0301 0111 0201 0111 0101 0210 0100 0201
0201 0100 0111 0300 0301 0201 0301 0300 0311 0100
0300 0111 1301 0200 0111 0211 0301 0111 0101 0101
0211 0200 0300 0311 0301 0101 0200 0101 1311 0201
0101 0301 0200 0101 0211 0200 0111 0301 0201 0301
0100 0110 0200 0200 0311 0201 0310 0201 0101 0301
1111 0300 0301 0111 0301 0201 0300 0311 0310
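The question asks for accuracy, sensitivity, specificity, and precision at cutoffs of 0.5 and 0.1, with the RNG version and seed fixed first. As a rough sketch of how those rates relate to a cutoff, the base R code below may help. The helper `classification_metrics`, the toy vectors, and the `sample()`-based 60/40 split are all assumptions made for illustration; they are not the naïve Bayes model, the fraud data, or the partitioning routine the expected answers rely on, and the sketch assumes a case is classified as 1 when its predicted probability is strictly greater than the cutoff.

```r
# Reproduce the R 3.5.3 random number behaviour, as the question instructs.
suppressWarnings(RNGversion("3.5.3"))
set.seed(1)

# Hypothetical helper (not from the source): rates at a given cutoff.
# actual: 0/1 vector of true classes; prob: predicted probability of class 1.
# Assumption: a case is classified as 1 when prob > cutoff.
classification_metrics <- function(actual, prob, cutoff = 0.5) {
  predicted <- as.integer(prob > cutoff)
  tp <- sum(predicted == 1 & actual == 1)  # true positives
  tn <- sum(predicted == 0 & actual == 0)  # true negatives
  fp <- sum(predicted == 1 & actual == 0)  # false positives
  fn <- sum(predicted == 0 & actual == 1)  # false negatives
  c(accuracy    = (tp + tn) / length(actual),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    precision   = tp / (tp + fp))
}

# Toy values for illustration only; they are NOT the fraud data or its results.
toy_actual <- c(1, 0, 0, 1, 0, 0, 0, 1, 0, 0)
toy_prob   <- c(0.70, 0.20, 0.05, 0.30, 0.15, 0.60, 0.08, 0.12, 0.03, 0.40)

# A plain 60/40 index split with sample(); a stratified partition from a
# modelling package would generally select different rows.
n <- length(toy_actual)
train_idx <- sample(seq_len(n), size = round(0.6 * n))
valid_idx <- setdiff(seq_len(n), train_idx)

# Compare the rates at the two cutoffs named in the question.
metrics_050 <- round(classification_metrics(toy_actual, toy_prob, cutoff = 0.5), 2)
metrics_010 <- round(classification_metrics(toy_actual, toy_prob, cutoff = 0.1), 2)
print(metrics_050)
print(metrics_010)
```

Lowering the cutoff classifies more cases as fraudulent, so sensitivity cannot decrease and specificity cannot increase; on the toy values above, sensitivity rises while specificity and accuracy fall.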
