Question:

Reconsider Problem 3.6. Partition the historical records into a training partition (60 percent of the 800 records) and a validation partition (the remaining 40 percent of the 800 records). 
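The 60/40 partition can be sketched as a seeded random shuffle followed by a split. This is only an illustration: the function name `partition`, the seed, and the use of row indices in place of the actual spreadsheet records are assumptions, and the textbook's own software may partition differently.

```python
import random

def partition(records, train_fraction=0.6, seed=0):
    """Shuffle a copy of the records and split it into (training, validation)."""
    shuffled = list(records)
    # A fixed seed (an arbitrary choice here) makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder row indices standing in for the 800 historical records.
train, valid = partition(range(800))
print(len(train), len(valid))  # 480 320
```

Every model in parts a through d would then be fit on `train` only and scored on `valid` only.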

a. Determine the accuracy, sensitivity, and specificity for the KNN model with k = 5 (based on the historical data in the training partition), when it is used to classify applicants from the validation partition as likely to graduate (or not), based on a 50 percent cutoff. 

b. Repeat part a with the classification tree model, with a maximum of seven splits as the stopping rule and using the best pruned tree. 

c. Repeat part a with the naïve Bayes model using the data on the Binned Data worksheet tab. 

d. Repeat part a with the logistic regression model. 

e. Comment on which model(s) perform best.
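Parts a through d all report the same three measures, which can be computed from actual outcomes and model-estimated probabilities on the validation partition. The sketch below assumes that "graduated" is the positive class (coded 1) and that a probability exactly at the cutoff counts as a predicted graduate. The function name and the eight toy records are hypothetical and are not taken from the Pathfinder data.

```python
def classification_metrics(actual, prob_graduate, cutoff=0.5):
    """Return accuracy, sensitivity, and specificity at the given cutoff.

    actual        : list of 1 (graduated) or 0 (did not graduate)
    prob_graduate : model-estimated probabilities of graduating
    Assumes both classes occur in `actual`; otherwise a ratio is undefined.
    """
    predicted = [1 if p >= cutoff else 0 for p in prob_graduate]
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of actual graduates caught
        "specificity": tn / (tn + fp),  # share of non-graduates caught
    }

# Toy validation outcomes and probabilities (made up for illustration).
actual = [1, 1, 1, 0, 0, 1, 0, 1]
probs = [0.8, 0.6, 0.4, 0.2, 0.6, 0.5, 0.4, 1.0]
metrics = classification_metrics(actual, probs)
print(metrics)
```

For part e, comparing these three values across the four models on the same validation partition gives a like-for-like basis for judging which performs best.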

 

Data from Problem 3.6.

As first described in Problem 2.9, Pathfinder College is a small liberal arts college that wants to improve its admissions process. In particular, too many of its incoming freshmen have failed to graduate for a variety of reasons, including dropping out, transferring to another college, or failing to satisfy the graduation standards. Therefore, the admissions office wants a better method of predicting whether an applicant would succeed in graduating. The college primarily uses two predictor variables for evaluating an applicant, namely, the high school grade point average (GPA) and the SAT score. If either is sufficiently high (a GPA ≥ 3.30 or an SAT score ≥ 1200), the applicant is immediately accepted. Applicants admitted this way who enroll usually do well and go on to graduate. However, the admissions office needs to go deeper into its applicant pool to fill up its freshman class. The question is how to predict which of these other applicants are the better bets for successfully graduating. The admissions office wishes to consider an applicant for admission only if the applicant is predicted to be more likely than not to graduate.

The admissions office has randomly selected 800 students who were previously admitted and enrolled but did not meet the criteria for immediate acceptance. All have now had sufficient time to graduate. The data for these historical records, including their graduation status, are provided in the spreadsheet file titled Pathfinder College Data, available at www.mhhe.com/Hillier7e. The admissions office is currently evaluating the three applicants listed below and wants a prediction of each one's likelihood of eventually graduating.

Using all the data (unpartitioned) on the Clean Data worksheet tab, apply the KNN algorithm to this problem with k = 7 and the data rescaled using standardization. Assuming an applicant is classified as a likely success (and admitted) if they are at least 50 percent likely to graduate, classify each of the following applicants and indicate the estimated probability of graduating.
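The KNN step described here can be sketched in plain Python: standardize each predictor to z-scores, find the k nearest historical records by Euclidean distance, and take the share of graduates among them as the estimated probability. Several things below are assumptions rather than details from the problem: the only predictors are GPA and SAT, the sample standard deviation is used for scaling, distance ties are broken arbitrarily, and the ten training rows and the applicant are invented for illustration.

```python
import math
from statistics import mean, stdev

def knn_probability(train_X, train_y, applicant, k=7):
    """Estimate P(graduate) as the fraction of graduates among the
    k nearest training records, measured on standardized predictors."""
    columns = list(zip(*train_X))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]  # sample std. dev. (an assumption)

    def standardize(row):
        return [(v - m) / s for v, m, s in zip(row, mus, sds)]

    z_applicant = standardize(applicant)
    # Pair each record's distance with its outcome, then sort by distance.
    by_distance = sorted(
        (math.dist(standardize(row), z_applicant), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in by_distance[:k]) / k

# Invented (GPA, SAT) records and outcomes; 1 = graduated, 0 = did not.
train_X = [(2.5, 1000), (2.7, 1050), (3.0, 1100), (3.1, 1150), (2.4, 950),
           (2.9, 1120), (3.2, 1180), (2.6, 990), (2.8, 1080), (3.0, 1010)]
train_y = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]

p = knn_probability(train_X, train_y, applicant=(3.1, 1160), k=7)
admit = p >= 0.5  # "at least 50 percent likely to graduate"
print(p, admit)
```

Standardizing first matters because SAT scores span hundreds of points while GPA spans about one, so unscaled distances would be driven almost entirely by SAT.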
