Question: Use the Kaggle Credit Card Data set for this exercise. Use 100K records and the entire data set, representing fraudulent and non-fraudulent data. Use the same approach to generate test and training data sets as in the previous assignment. Perform ridge and lasso to reduce the input feature set. Use the reduced feature set to rerun the logistic regression. Identify the reduced input feature set. Compare with the raw logistic regression. Total accuracy is not a good measure for this comparison. Explain why. Use other measures to compare the two models.

As explained in class, this credit card data set is unbalanced. Read https://journal.r-project.org/archive/2014-1/menardi-lunardon-torelli.pdf for a discussion of how to handle unbalanced data sets. Make a PowerPoint presentation of the technique used with unbalanced data in that paper. Use the ROSE package discussed there to adjust for the imbalance in the credit fraud data. Run logistic regression with the new data set. Also check https://cran.r-project.org/web/packages/ROSE/ROSE.pdf for a more concise explanation.
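The point about total accuracy can be illustrated with a small sketch. It is written in Python using only the standard library, not with the R packages the question names, and it is not a solution to the exercise. The class counts (284,807 transactions, 492 of them fraudulent) are the commonly cited figures for the Kaggle data set and are an assumption here; check them against your own copy. The function name `metrics` is hypothetical.

```python
# Assumed class counts for the Kaggle credit card fraud data set
# (commonly cited figures; verify against your own download).
TOTAL = 284_807
FRAUD = 492


def metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# A trivial "model" that labels every transaction as non-fraudulent:
# no true or false positives, every fraud is a false negative.
baseline = metrics(tp=0, fp=0, tn=TOTAL - FRAUD, fn=FRAUD)

for name, value in baseline.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts the trivial model scores above 99.8% accuracy while detecting no fraud at all (recall of 0), which is why measures such as precision, recall, F1, or the area under the ROC curve are more informative when comparing the raw and reduced logistic regressions.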
