Question:

Reconsider Problem 3.17. Partition the 4,985 historical records of interest into a training partition (60 percent of the records) and a validation partition (the remaining 40 percent of the records). 

a. Determine the MAD and MSE when using the KNN algorithm with k = 10 to predict the number of late payments of applicants from the validation partition. 

b. Repeat part a with the regression tree algorithm, with a maximum of seven splits as the stopping rule and using the full tree. 

c. Repeat part a with the multiple linear regression model. 

d. Comment on which model(s) perform best.
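The evaluation in parts a through c can be sketched in plain Python. This is a minimal illustration, not the textbook's spreadsheet procedure: the helper names (`partition`, `knn_predict`, `mad`, `mse`), the random seed, and the synthetic records are all assumptions made for the example, and none of the numbers come from the Friendly Bank data. MAD is taken here as the mean of the absolute prediction errors and MSE as the mean of the squared errors, both computed on the validation partition.

```python
import random

def mad(actual, predicted):
    """Mean absolute deviation: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def partition(records, train_fraction=0.6, seed=0):
    """Shuffle a copy of the records and split it into training and validation."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train, query, k=10):
    """Predict a numeric target as the average target of the k nearest
    training records (Euclidean distance on the predictor values)."""
    nearest = sorted(
        train,
        key=lambda rec: sum((x - q) ** 2 for x, q in zip(rec[0], query)),
    )[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Synthetic stand-in data (an assumption, NOT the Friendly Bank records):
# each record is ((predictor_1, predictor_2), number_of_late_payments).
rng = random.Random(1)
records = [((rng.gauss(0, 1), rng.gauss(0, 1)), rng.randint(0, 5))
           for _ in range(50)]

train, valid = partition(records, train_fraction=0.6)
actual = [target for _, target in valid]
predicted = [knn_predict(train, features, k=10) for features, _ in valid]

print("MAD:", mad(actual, predicted))
print("MSE:", mse(actual, predicted))
```

The same `mad` and `mse` functions can score the predictions of any other model, such as a regression tree or a multiple linear regression, as long as every model is evaluated on the same validation records. In practice the predictors would usually be rescaled before computing distances for KNN.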


Data from Problem 3.17. 

As first described in Problem 2.16, Friendly Bank is very active in making loans to deserving people in the local community. However, the bank needs to evaluate each loan carefully to make sure that the recipient will likely repay the loan as scheduled. Therefore, the bank needs a prediction of whether repayment is likely and an estimate of its probability. The bank primarily uses the annual income and the credit rating of the person applying for the loan as the predictor variables for obtaining this prediction.

The bank has compiled all of the historical records of substantial loans and their outcomes over recent years. This information is provided in the spreadsheet titled Friendly Bank Data, available at www.mhhe.com/Hillier7e. Only loans that have concluded (either paid off in full or ending in default) are included, resulting in 4,985 total records.

The bank currently is evaluating the three loan applications described below. Using all the data (unpartitioned) on the Clean Data worksheet tab, with the data rescaled using standardization, apply the KNN algorithm with k = 10 to classify each of the following applicants as either likely to default (defined as more than a 10 percent chance of default) or not likely to default (defined as a 10 percent chance of default or less). Also indicate the estimated probability of default for each.
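The classification rule described in Problem 3.17 can be sketched as follows. This is a hedged illustration under stated assumptions: the function names are hypothetical, the twelve toy records are invented for the example (they are not the Friendly Bank data), and standardization here uses the population standard deviation, which may differ from the convention in a given spreadsheet tool. The estimated probability of default is taken as the fraction of the k nearest historical records that ended in default.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1.
    Returns the scaled rows plus the column means and deviations."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]  # assumption: population std dev
    scaled = [tuple((v - m) / s for v, m, s in zip(row, mus, sds))
              for row in rows]
    return scaled, mus, sds

def knn_default_probability(scaled_rows, defaulted, applicant, mus, sds, k=10):
    """Fraction of the k nearest historical records that defaulted."""
    z = tuple((v - m) / s for v, m, s in zip(applicant, mus, sds))
    order = sorted(
        range(len(scaled_rows)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(scaled_rows[i], z)),
    )
    return sum(defaulted[i] for i in order[:k]) / k

def classify(probability, cutoff=0.10):
    """Likely to default only if the probability exceeds the cutoff."""
    if probability > cutoff:
        return "likely to default"
    return "not likely to default"

# Toy records (assumed for illustration): (annual income in $1,000s, credit rating)
rows = [(30 + 5 * i, 600 + 10 * i) for i in range(12)]
defaulted = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 1 = ended in default

scaled, mus, sds = standardize(rows)
for applicant in [(30, 600), (85, 710)]:  # hypothetical applicants
    p = knn_default_probability(scaled, defaulted, applicant, mus, sds, k=10)
    print(applicant, p, classify(p))
```

Note that the applicant's values are rescaled with the means and standard deviations of the historical data, so distances are measured on the same scale for both predictors.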
