Question:

You are a data scientist recently hired by Universal Bank, a mid-sized bank in the southern United States. Most of your work to this point has involved pulling reports from databases, but now you have been given a more interesting task. The bank is facing competition from online lenders that can offer rapid automated loan approvals, and it wants to develop its own predictive model so that it can do likewise. Before building the web infrastructure, which could be costly, the bank wants to pilot a prototype loan approval model, developed by you.
The bank wants to launch its model with the approval process for personal loans extended to existing customers. The bank has only been offering these loans for a relatively short time, so it has little data on default rates. It does have data on prior loan applications and whether they were approved or disapproved.
You have done some preliminary data prep and feature selection work, resulting in the dataset of 5000 records for this project. Each record is for a customer and consists of feature values for that customer along with a record of the human decision on their loan application. Your end goal is to develop a model to predict that human decision and an accompanying report to the bank’s chief lending officer.
Regulatory Requirements
You are somewhat familiar with regulatory requirements with respect to discrimination (do a web search for the US Department of Justice Equal Credit Opportunity Act, or ECOA). Bank attorneys have told you that the ECOA requirements pertain to the basis for credit decisions and do not mean that the proportion of approved loans must be the same for all groups.

1. Identify any features that might need to be excluded from the modeling task, per the ECOA.
2. Should you simply eliminate these features from the data?
3. Explore the data, with a focus on loan approval rates for different groups.
4. Split the data into training and validation data, and fit several models of your choice to predict whether a personal loan should be approved, using only permitted features. There are a number of possible performance measures; evaluate the model performance by the metric(s) you consider useful.
5. Assess the usefulness of the model from a pure model performance standpoint.
6. Considering the “protected” categories per the ECOA, evaluate whether the model is fair.
7. Describe steps that might be taken to improve the model fairness. Implement any measures that can be taken without going beyond the dataset at hand.
8. Write a very short report summarizing your findings.
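
As a starting point for tasks 3 and 6, the group-level comparison could be sketched as below. This is a minimal illustration, not the book's solution: the function names `group_metrics` and `rate_ratio`, the group labels "A" and "B", and the twelve toy records are all hypothetical and do not come from the bank's dataset. It uses only the Python standard library; the same summaries could be produced with a dataframe library.

```python
from collections import defaultdict


def group_metrics(y_true, y_pred, groups):
    """Per-group approval and error rates.

    y_true: human decisions (1 = approved, 0 = disapproved)
    y_pred: model predictions, same coding
    groups: protected-attribute value per record (used for auditing only)
    """
    counts = defaultdict(
        lambda: {"n": 0, "actual_pos": 0, "pred_pos": 0, "tp": 0, "fp": 0}
    )
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["actual_pos"] += t
        c["pred_pos"] += p
        c["tp"] += int(t == 1 and p == 1)  # approved by both human and model
        c["fp"] += int(t == 0 and p == 1)  # model approves, human did not

    report = {}
    for g, c in counts.items():
        actual_neg = c["n"] - c["actual_pos"]
        report[g] = {
            "n": c["n"],
            "actual_approval_rate": c["actual_pos"] / c["n"],
            "predicted_approval_rate": c["pred_pos"] / c["n"],
            "true_positive_rate": (
                c["tp"] / c["actual_pos"] if c["actual_pos"] else None
            ),
            "false_positive_rate": c["fp"] / actual_neg if actual_neg else None,
        }
    return report


def rate_ratio(report, key):
    """Lowest group rate divided by highest group rate for one metric."""
    rates = [m[key] for m in report.values() if m[key] is not None]
    if not rates or max(rates) == 0:
        return None
    return min(rates) / max(rates)


# Toy records, invented purely for illustration (hypothetical groups A and B).
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0]
groups = ["A"] * 6 + ["B"] * 6

report = group_metrics(y_true, y_pred, groups)
for name, metrics in sorted(report.items()):
    print(name, metrics)
print("approval-rate ratio:", rate_ratio(report, "predicted_approval_rate"))
```

Passing the human decisions as both `y_true` and `y_pred` gives the plain approval rates by group for the exploration step; passing validation-set predictions as `y_pred` compares the model's behavior across groups. The min/max ratio is only one of several possible disparity summaries. Note that an audit like this needs the protected attribute to remain available at evaluation time, even if it is withheld from the model's inputs.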

Related Book:

Machine Learning For Business Analytics

ISBN: 9781119828792

1st Edition

Authors: Galit Shmueli, Peter C. Bruce, Amit V. Deokar, Nitin R. Patel