Question:
ALL INFORMATION IS PROVIDED! USE R CODE. ANYTHING IN BOLD IS R CODE THAT IS PROVIDED.
##### Problem 1
Consider the business school admission data available in `admission.csv`. The admission officer of a business school has used an "index" of undergraduate grade point average ($X_1$=`GPA`) and graduate management aptitude test ($X_2$=`GMAT`) scores to help decide which applicants should be admitted to the school's graduate programs. This index is used to categorize each applicant into one of three groups - `admit` (group 1), `do not admit` (group 2), and `borderline` (group 3).
1. First let's import the data set using the `read.csv()` function.
```{r}
library(caret) # load the caret package
admData <- read.csv('admission.csv') # change the path to the file if needed
```
2. Now, let's create 10 folds to be used by our models. This is done so that all the models are fit and tested on the same training and test sets of data points.
```{r}
set.seed(123) # for reproducibility of results - don't remove this line
testInd <- createFolds(admData$Group,k=10)
```
For example,
```{r}
testInd[[1]]
# 9 11 14 46 51 59 75 83
```
stores the indices of the test points of the 1st fold.
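To make the role of these indices concrete, the sketch below derives the training rows for the 1st fold by dropping its test indices. It uses base R only. The names `n`, `testIdx`, and `trainIdx` are illustrative, and `n = 85` is an assumption based on the row count of the data listing given with this problem.

```{r}
# Assumption: the data set has 85 rows, as in the listing given with this problem.
n <- 85

# Test indices of the 1st fold, copied from the output shown above.
testIdx <- c(9, 11, 14, 46, 51, 59, 75, 83)

# Every remaining row is used for training in this fold.
trainIdx <- setdiff(seq_len(n), testIdx)
length(trainIdx) # 77 training rows, 8 test rows

# With the data loaded, the same split is admData[-testIdx, ] (training)
# and admData[testIdx, ] (test).
```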
a. Using the `train()` function of the `caret` package, fit `Multinomial Logistic Regression` 10 times where each time you exclude the data points from the $k$-th fold during model training. Set your `method` argument to "multinom". Since the response variable `Group` is coded as numeric (with values 1, 2, and 3), convert it into a factor variable using the `as.factor()` function during model fitting. Compute an estimate for the test `Accuracy` (i.e. the accuracy on the test set) using 10-fold cross validation.
b. Repeat part (a) for `LDA` setting the `method` argument to "lda".
c. Repeat part (a) for `QDA` setting the `method` argument to "qda".
d. Repeat part (a) for `Naive Bayes` setting the `method` argument to "nb".
e. Repeat part (a) for `KNN` with $K=1,2,\ldots,10$ and setting the `method` argument to "knn". In this case, standardize your data prior to fitting the `KNN` model. Choose the optimal value of $K$ using 5-fold cross validation (_hint_: set `trControl=trainControl(method='cv',number = 5)` within the `train()` function).
f. In a single table report the **cross validation based** estimates of the test `Accuracy` for each model. Based on the results which model would you recommend?
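One possible way to organize parts (a) through (e) is a single helper that loops over the folds, sketched below under stated assumptions. `cvAccuracy` is a hypothetical name, not part of `caret`. The sketch assumes the columns are named `GPA`, `GMAT`, and `Group`, and that the packages behind each `method` (for example `MASS` for "lda") are installed. Extra arguments are forwarded to `train()` unchanged, so the KNN settings from part (e) can be passed through `...`.

```{r}
library(caret) # provides train(), trainControl() and createFolds()

# Hypothetical helper: mean test accuracy of `method` over the supplied folds.
cvAccuracy <- function(method, data, folds, ...) {
  data$Group <- as.factor(data$Group) # the response must be a factor
  foldAcc <- sapply(folds, function(testIdx) {
    # Fit on every row except those in the current fold.
    fit <- train(Group ~ GPA + GMAT, data = data[-testIdx, ],
                 method = method, ...)
    # Predict the held-out rows and score them.
    pred <- predict(fit, newdata = data[testIdx, ])
    mean(pred == data$Group[testIdx])
  })
  mean(foldAcc) # average accuracy across the folds
}

# Usage sketch; runs only if the data file is present in the working directory.
if (file.exists("admission.csv")) {
  admData <- read.csv("admission.csv")
  set.seed(123)
  testInd <- createFolds(admData$Group, k = 10)
  ldaAcc <- cvAccuracy("lda", admData, testInd)
  knnAcc <- cvAccuracy("knn", admData, testInd,
                       preProcess = c("center", "scale"), # standardize predictors
                       tuneGrid = data.frame(k = 1:10),   # candidate values of K
                       trControl = trainControl(method = "cv", number = 5))
  print(c(lda = ldaAcc, knn = knnAcc))
}
```

Because every model is scored on the same `testInd` folds, the resulting accuracies can be compared directly when building the table in part (f).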
admission.csv
| GPA | GMAT | Group |
| --- | --- | --- |
| 2.96 | 596 | 1 |
| 3.14 | 473 | 1 |
| 3.22 | 482 | 1 |
| 3.29 | 527 | 1 |
| 3.69 | 505 | 1 |
| 3.46 | 693 | 1 |
| 3.03 | 626 | 1 |
| 3.19 | 663 | 1 |
| 3.63 | 447 | 1 |
| 3.59 | 588 | 1 |
| 3.3 | 563 | 1 |
| 3.4 | 553 | 1 |
| 3.5 | 572 | 1 |
| 3.78 | 591 | 1 |
| 3.44 | 692 | 1 |
| 3.48 | 528 | 1 |
| 3.47 | 552 | 1 |
| 3.35 | 520 | 1 |
| 3.39 | 543 | 1 |
| 3.28 | 523 | 1 |
| 3.21 | 530 | 1 |
| 3.58 | 564 | 1 |
| 3.33 | 565 | 1 |
| 3.4 | 431 | 1 |
| 3.38 | 605 | 1 |
| 3.26 | 664 | 1 |
| 3.6 | 609 | 1 |
| 3.37 | 559 | 1 |
| 3.8 | 521 | 1 |
| 3.76 | 646 | 1 |
| 3.24 | 467 | 1 |
| 2.54 | 446 | 2 |
| 2.43 | 425 | 2 |
| 2.2 | 474 | 2 |
| 2.36 | 531 | 2 |
| 2.57 | 542 | 2 |
| 2.35 | 406 | 2 |
| 2.51 | 412 | 2 |
| 2.51 | 458 | 2 |
| 2.36 | 399 | 2 |
| 2.36 | 482 | 2 |
| 2.66 | 420 | 2 |
| 2.68 | 414 | 2 |
| 2.48 | 533 | 2 |
| 2.46 | 509 | 2 |
| 2.63 | 504 | 2 |
| 2.44 | 336 | 2 |
| 2.13 | 408 | 2 |
| 2.41 | 469 | 2 |
| 2.55 | 538 | 2 |
| 2.31 | 505 | 2 |
| 2.41 | 489 | 2 |
| 2.19 | 411 | 2 |
| 2.35 | 321 | 2 |
| 2.6 | 394 | 2 |
| 2.55 | 528 | 2 |
| 2.72 | 399 | 2 |
| 2.85 | 381 | 2 |
| 2.9 | 384 | 2 |
| 2.86 | 494 | 3 |
| 2.85 | 496 | 3 |
| 3.14 | 419 | 3 |
| 3.28 | 371 | 3 |
| 2.89 | 447 | 3 |
| 3.15 | 313 | 3 |
| 3.5 | 402 | 3 |
| 2.89 | 485 | 3 |
| 2.8 | 444 | 3 |
| 3.13 | 416 | 3 |
| 3.01 | 471 | 3 |
| 2.79 | 490 | 3 |
| 2.89 | 431 | 3 |
| 2.91 | 446 | 3 |
| 2.75 | 546 | 3 |
| 2.73 | 467 | 3 |
| 3.12 | 463 | 3 |
| 3.08 | 440 | 3 |
| 3.03 | 419 | 3 |
| 3 | 509 | 3 |
| 3.03 | 438 | 3 |
| 3.05 | 399 | 3 |
| 2.85 | 483 | 3 |
| 3.01 | 453 | 3 |
| 3.03 | 414 | 3 |
| 3.04 | 446 | 3 |
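As a quick consistency check, the listing above contains 85 applicants: 31 in group 1, 28 in group 2, and 26 in group 3. The sketch below compares a loaded data frame against those counts. `checkAdmission` is a hypothetical helper name, and the file path is the one assumed in step 1.

```{r}
# Hypothetical helper: TRUE if `d` has the columns and group sizes of the listing above.
checkAdmission <- function(d) {
  counts <- table(factor(d$Group, levels = 1:3))
  all(c("GPA", "GMAT", "Group") %in% names(d)) &&
    nrow(d) == 85 &&
    all(as.integer(counts) == c(31, 28, 26))
}

# Runs only if the data file is present in the working directory.
if (file.exists("admission.csv")) {
  print(checkAdmission(read.csv("admission.csv")))
}
```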
