
Question:

Consider the 2013 declined loan data from LendingClub titled “RejectStatsB2013.” Similar to the analysis done in the chapter, let’s scrub the employment length. Because our analysis requires risk scores, debt-to-income data, and employment length, we need to make sure each of them has valid data.

a. Sort the file based on employment length and remove those observations (the complete row or record) that have a missing value (“NA”) or a value of zero.
b. Sort the file based on debt-to-income and remove those observations (the complete row or record) that have a missing value, a value of zero, or a negative value.
c. Sort the file based on risk score and remove those observations (the complete row or record) that have a missing score or a score of zero.
d. There should now be 669,993 observations. Any thoughts on what biases are imposed when we remove observations? Is there another way to do this?
e. Run a PivotTable analysis to show the number of loans that have an Excellent risk score but fall in the High DTI bucket, broken out by employment-length bucket. Based on employment length, what might explain why these loans were declined?
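
For readers who would rather script steps a through c and the count in part e than do them by hand in a spreadsheet, the logic can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the field names (`risk_score`, `dti`, `emp_length`), the sample rows, and the bucket cutoffs (`EXCELLENT_MIN`, `HIGH_DTI_MIN`) are all hypothetical placeholders, not values from the RejectStatsB2013 file or the chapter. Substitute the real column headers and the chapter's bucket definitions before drawing any conclusions.

```python
from collections import Counter

# Hypothetical sample rows; the real file has different headers and far more records.
rows = [
    {"risk_score": 760, "dti": 45.0, "emp_length": 5},
    {"risk_score": 0, "dti": 20.0, "emp_length": 3},       # zero risk score
    {"risk_score": 780, "dti": -1.0, "emp_length": 10},    # negative debt-to-income
    {"risk_score": 640, "dti": 38.0, "emp_length": None},  # missing employment length
    {"risk_score": 790, "dti": 52.0, "emp_length": 5},
    {"risk_score": 700, "dti": 12.0, "emp_length": 1},
]

# Placeholder cutoffs (assumptions); use the chapter's bucket definitions instead.
EXCELLENT_MIN = 750
HIGH_DTI_MIN = 40.0


def is_valid(row):
    """Apply the removal rules from steps a, b, and c to one record."""
    emp, dti, score = row["emp_length"], row["dti"], row["risk_score"]
    if emp is None or emp == 0:      # step a: missing or zero employment length
        return False
    if dti is None or dti <= 0:      # step b: missing, zero, or negative debt-to-income
        return False
    if score is None or score == 0:  # step c: missing or zero risk score
        return False
    return True


# Drop the whole record whenever any of the three fields fails its rule.
cleaned = [row for row in rows if is_valid(row)]

# Part e: count Excellent-score, High-DTI records per employment length.
pivot = Counter(
    row["emp_length"]
    for row in cleaned
    if row["risk_score"] >= EXCELLENT_MIN and row["dti"] >= HIGH_DTI_MIN
)

print(len(cleaned))
print(dict(pivot))
```

Filtering every rule in a single pass gives the same surviving records as the sort-and-delete sequence in steps a through c, because each rule looks only at its own field. Comparing `len(cleaned)` on the real file against the count stated in part d is a quick check that the rules were applied as intended.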
