Question: IN PYTHON ONLY For this case, you need to apply the decision rules and cutoff probability of 0.5 from Section 4.3

IN PYTHON ONLY
For this case, you need to apply the decision rules and cutoff probability of 0.5 from Section 4.3 to classify the two loans in Table 10 of the article "Should This Loan be Approved or Denied?: A Large Dataset with Class Assignment Guidelines" as higher risk or lower risk for loan approval. Do this by writing Python code to reproduce the results (not the format) in Tables 7(a), 8, and 9 of this article using the SBA case data SBAcase.11.13.17.csv. The variable Selected indicates which observations are the training data and which are the testing data (1 = training data to be used to build the model, 0 = testing data to validate the model). Partition the data using this variable.
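The partition on Selected can be sketched as below, assuming the CSV has been read into a pandas DataFrame. Only the column name Selected and its 1/0 coding come from the assignment; the helper name partition_by_selected, the outcome column name Default, and the toy rows are hypothetical stand-ins for illustration.

```python
import pandas as pd


def partition_by_selected(df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Split rows into (train, test) using the Selected flag (1 = train, 0 = test)."""
    train = df[df["Selected"] == 1].copy()
    test = df[df["Selected"] == 0].copy()
    return train, test


# Toy stand-in for pd.read_csv("SBAcase.11.13.17.csv"); every value here is made up.
toy = pd.DataFrame(
    {
        "Selected": [1, 0, 1, 1, 0],
        "RealEstate": [0, 1, 0, 1, 0],
        "Portion": [0.50, 0.75, 0.85, 0.50, 0.80],
        "Recession": [0, 0, 1, 0, 1],
        "Default": [1, 0, 1, 0, 0],  # assumed name for the outcome column
    }
)
train_df, test_df = partition_by_selected(toy)
print(len(train_df), len(test_df))
```

With the real file, the toy frame would be replaced by the loaded data, and the model would be fitted on the training rows only.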
(a) Review the Python documentation for statsmodels and the example code from the class. Fit a logistic regression model to reproduce the results (not the format) in Tables 7(a), 8, and 9 of this article using the SBA case data SBAcase.11.13.17.csv, using the statsmodels functions sm.GLM or smf.glm.
The logit model produces an estimated probability of being a 1. Classify an observation as 1 if this estimated probability > cutoff, e.g., 0.5. The following code from Table 10.3 in the example code may be helpful for performing this classification:
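import pandas as pd
import statsmodels.api as sm

# Fit a logistic regression (binomial GLM) on the training data
logit = sm.GLM(train_y, train_X, family=sm.families.Binomial())
result = logit.fit()
# Estimated probability of class 1 for each validation record
predictions = result.predict(valid_X)
# Apply the 0.5 cutoff
predictions_nominal = [0 if x < 0.5 else 1 for x in predictions]
logit_result = pd.DataFrame({'actual': valid_y,
                             'p(0)': 1 - predictions,
                             'p(1)': predictions,
                             'predicted': predictions_nominal})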
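The prose states the rule as "probability > cutoff", while the list comprehension in the snippet assigns 1 whenever the probability is not below 0.5. The two differ only when a probability equals the cutoff exactly. A minimal sketch of the strict version, together with a simple count of actual versus predicted classes, might look like this; the function names and the sample numbers are hypothetical and not part of the class code.

```python
def classify(probabilities, cutoff=0.5):
    """Return 1 where the estimated probability strictly exceeds the cutoff, else 0."""
    return [1 if p > cutoff else 0 for p in probabilities]


def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs for a 2x2 classification table."""
    counts = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts


# Made-up probabilities and outcomes, for illustration only.
probs = [0.10, 0.50, 0.51, 0.90]
actual = [0, 1, 1, 0]
predicted = classify(probs)
table = confusion_counts(actual, predicted)
# Share of records whose predicted class differs from the actual class
misclassification_rate = (table[(0, 1)] + table[(1, 0)]) / len(actual)
```

Passing the predicted probabilities from the fitted model to classify in place of probs would give the predicted classes for the testing data.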
(b) Refer to Table 8 of the article. Write the estimated equation that associates the outcome variable (i.e., default or not) with predictors RealEstate, Portion, and Recession, in three formats:
(i) The logit as a function of the predictors (see (10.6) of DMBA Chapter 10.2)
(ii) The odds as a function of the predictors (see (10.5) of DMBA Chapter 10.2)
(iii) The probability as a function of the predictors (see (10.2) of DMBA Chapter 10.2)
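The three formats in (i) to (iii) are algebraically equivalent, which a short sketch can make concrete. The coefficient values below are placeholders chosen only so the code runs; they are not the estimates from Table 8 and should be replaced with the fitted values. The function names are likewise hypothetical.

```python
import math

# Placeholder coefficients (hypothetical, NOT the article's estimates).
COEF = {"Intercept": 0.0, "RealEstate": 1.0, "Portion": 1.0, "Recession": 1.0}


def logit(real_estate, portion, recession, coef=COEF):
    """(i) logit = b0 + b1*RealEstate + b2*Portion + b3*Recession"""
    return (coef["Intercept"]
            + coef["RealEstate"] * real_estate
            + coef["Portion"] * portion
            + coef["Recession"] * recession)


def odds(real_estate, portion, recession, coef=COEF):
    """(ii) odds = exp(logit)"""
    return math.exp(logit(real_estate, portion, recession, coef))


def probability(real_estate, portion, recession, coef=COEF):
    """(iii) p = 1 / (1 + exp(-logit)), equivalently odds / (1 + odds)"""
    return 1.0 / (1.0 + math.exp(-logit(real_estate, portion, recession, coef)))
```

Evaluating probability for each loan's predictor values and comparing the result with the 0.5 cutoff gives the higher-risk or lower-risk classification.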
(c) Explain why risk indicators in Table 8 were selected using p-values in Table 7(a).
(d) Interpret parameter (coefficient) estimates of the model in Table 8 with a focus on the odds of default. Answer the following questions by interpreting parameter estimates of the model in Table 8 and specifying odds and probabilities of default for these risk indicators.
(i) Is a loan backed by real estate more likely or less likely to default (by how much)? Explain using parameter estimates.
(ii) Is a loan active during recession more likely or less likely to default (by how much)? Explain using parameter estimates.
(iii) How much does the portion of a loan guaranteed by SBA increase or decrease the likelihood of default? Explain using parameter estimates.
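For the interpretation in (d), a coefficient b in a logit model corresponds to multiplying the odds by exp(b) for a one-unit increase in the predictor, holding the other predictors fixed. The sketch below shows that arithmetic; the helper names and the sample coefficient are hypothetical and not taken from the article.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in the odds when a predictor rises by `delta`, others held fixed."""
    return math.exp(coefficient * delta)


def percent_change_in_odds(coefficient, delta=1.0):
    """The same change expressed as a percentage (negative means the odds fall)."""
    return (odds_ratio(coefficient, delta) - 1.0) * 100.0


# Hypothetical coefficient, used only to show the calls.
b = -0.7
print(round(odds_ratio(b), 3))
print(round(percent_change_in_odds(b), 1))
```

For 0/1 indicators such as RealEstate and Recession, the default delta of 1 compares the two categories. For Portion, assuming it is recorded as a proportion between 0 and 1, a smaller step such as delta=0.1 may be easier to interpret than a full unit.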
