Question: The data set is too big to include. However, if you can provide me with instructions, that would be super helpful.



[Screenshot of the fraud spreadsheet: an Audit column, 0/1 answers in question columns labeled up to Q24, and a FRAUD column. Individual cell values are not legible in the extracted text.]

CASE 10.1 Detecting Management Fraud

In the wake of the Enron scandal in 2002, two public accounting firms, Oscar Anderson (OA) and Trice-Milkhouse-Loopers (TML), merged (forming OATML) and are reviewing their methods for detecting management fraud during audits. The two firms had each developed their own set of questions that auditors could use in assessing management fraud. To avoid a repeat of the problems faced by Enron's auditors, OATML wants to develop an automated decision tool to assist auditors in predicting whether or not their clients are engaged in fraudulent management practices. This tool would basically ask an auditor all the OA or TML fraud detection questions and then automatically render a decision about whether or not the client company is engaging in fraudulent activities.

The decision problem OATML faces is really two-fold: 1) Which of the two sets of fraud detection questions are best at detecting fraud? and, 2) What's the best way to translate the answers to these questions into a prediction or classification about management fraud?

To assist in answering these questions, the company has compiled an Excel spreadsheet (the file Fraud.xlsm accompanying this book) that contains both the OA and TML fraud detection questions and answers to both sets of questions based on 382 audits previously conducted by the two companies (see sheets OA and TML, respectively). (Note: for all data 1=yes, 0=no.)
For each audit, the last variable in the spreadsheet indicates whether or not the respective companies were engaged in fraudulent activities (i.e., 77 audits uncovered fraudulent activities, 305 did not).

You have been asked to perform the following analysis and provide a recommendation as to what combination of fraud questions OATML should adopt.

1. For the OA fraud questions, create a correlation matrix for all the variables. Do any of the correlations pose a concern?
2. Using the 8 questions that correlate most strongly with the dependent fraud variable, partition the OA data with oversampling to create training and validation data sets with a 50% success rate in the training data. (Use the default seed of 12345.)
3. Use each of XLMiner's classification techniques to create classifiers for the partitioned OA dataset. Summarize the classification accuracy of each technique on the training and validation sets. Interpret these results and indicate which technique you would recommend OATML use.
4. For the TML fraud questions, create a correlation matrix for all the variables. Do any of the correlations pose a concern?
5. Using the 8 questions that correlate most strongly with the dependent fraud variable, partition the TML data with oversampling to create training and validation data sets with a 50% success rate in the training data. (Use the default seed of 12345.)
6. Use each of XLMiner's classification techniques to create classifiers for the partitioned TML dataset. Summarize the classification accuracy of each technique on the training and validation sets. Interpret these results and indicate which technique you would recommend OATML use.
7. Suppose OATML wants to use both fraud detection instruments and combine their individual results to create a composite prediction.
Let LR1 represent the logistic regression probability estimate for a given company using the OA fraud detection instrument and LR2 represent the same company's logistic regression probability estimate using the TML instrument. The composite score for the company might then be defined as C = w·LR1 + (1 − w)·LR2, where 0 ≤ w ≤ 1. A decision rule could then be created where we classify the company as non-fraudulent if C is less than or equal to some cut-off value, and as fraudulent otherwise. Use Solver's evolutionary optimizer to find the optimal value of w and the cut-off value that minimize the number of classification errors for the training data. What do you obtain for w and the cut-off value? Summarize the accuracy of this technique for the training and validation data sets. How do these results compare with the logistic regression results in questions 3 and 6?
8. What other techniques can you think of for combining OA's and TML's fraud detection questionnaires that might be beneficial to OATML?
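
Since the data set itself is not available here, the logic behind questions 1, 2, 4 and 5 can only be sketched on made-up data. The Python below is a rough illustration and not XLMiner's procedure: it ranks question columns by the absolute value of their correlation with FRAUD, then builds a training set in which half the rows are fraud cases. The toy columns Q1 to Q3, the helper names, and the choice to put half of the fraud cases into training are all assumptions. The seed 12345 is reused from the case text, but Python's random generator will not reproduce XLMiner's partition.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_k_questions(rows, target, k):
    """Return the k column names with the largest |correlation| with the target."""
    y = [r[target] for r in rows]
    names = [c for c in rows[0] if c != target]
    scores = {c: abs(pearson([r[c] for r in rows], y)) for c in names}
    return sorted(names, key=lambda c: scores[c], reverse=True)[:k]


def oversampled_partition(rows, target, train_frac=0.5, seed=12345):
    """Put a share of the fraud rows plus an equal number of non-fraud rows in
    training (50% success rate); all remaining rows go to validation."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[target] == 1]
    neg = [r for r in rows if r[target] == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    n = min(int(len(pos) * train_frac), len(neg))
    return pos[:n] + neg[:n], pos[n:] + neg[n:]


# Synthetic stand-in data (NOT the real Fraud.xlsm): 382 audits, 77 fraudulent.
toy_rng = random.Random(0)
rows = []
for i in range(382):
    fraud = 1 if i < 77 else 0
    rows.append({
        "Q1": fraud,                  # made up: tracks fraud exactly
        "Q2": toy_rng.randint(0, 1),  # made up: pure noise
        "Q3": toy_rng.randint(0, 1),  # made up: pure noise
        "FRAUD": fraud,
    })

best = top_k_questions(rows, "FRAUD", k=2)  # the case asks for k=8
train, valid = oversampled_partition(rows, "FRAUD")
print(best, len(train), len(valid))
```

With the real sheets you would use k=8 and also inspect the correlations between the questions themselves, since pairs of questions that are highly correlated with each other are the usual concern raised in questions 1 and 4.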
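
For question 7, the composite rule can be illustrated the same way. The sketch below swaps Solver's evolutionary optimizer for a plain grid search over w and the cut-off, which is a different technique but minimizes the same objective (the number of misclassified training cases). The probability pairs in `records` are invented for illustration; in practice LR1 and LR2 would come from the logistic regression outputs for the OA and TML training partitions.

```python
def composite(lr1, lr2, w):
    """Composite score C = w*LR1 + (1 - w)*LR2."""
    return w * lr1 + (1 - w) * lr2


def count_errors(records, w, cutoff):
    """Count misclassifications: predict fraud (1) when C > cutoff, else 0."""
    errors = 0
    for lr1, lr2, fraud in records:
        predicted = 1 if composite(lr1, lr2, w) > cutoff else 0
        if predicted != fraud:
            errors += 1
    return errors


def grid_search(records, steps=100):
    """Try w and cutoff on an evenly spaced grid in [0, 1].
    Returns (errors, w, cutoff) for the first combination with the fewest errors."""
    best = None
    for i in range(steps + 1):
        w = i / steps
        for j in range(steps + 1):
            cutoff = j / steps
            e = count_errors(records, w, cutoff)
            if best is None or e < best[0]:
                best = (e, w, cutoff)
    return best


# Made-up (LR1, LR2, actual fraud) triples, for illustration only.
records = [
    (0.90, 0.20, 1),
    (0.80, 0.60, 1),
    (0.70, 0.30, 1),
    (0.30, 0.70, 0),
    (0.10, 0.40, 0),
    (0.60, 0.90, 0),
]

errors, w, cutoff = grid_search(records)
print(errors, w, cutoff)
```

Several (w, cut-off) pairs usually tie for the minimum error count, so Solver may report different values from a grid search while achieving the same accuracy. The chosen pair should then be applied unchanged to the validation data to answer the comparison part of the question.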
