Question: Import the file containing Scripts 5.8 and 5.9 below into RStudio. Locate the line within Script 5.9 that appears as follows: #p.card <- ifelse(p.card > 0.5,2,1). Remove the comment character and execute both Script 5.8 and Script 5.9. You will notice that including the ifelse replaces the smooth ROC curve with straight line segments. This happens because the decimal values within p.card are now all 1s and 2s, so the plot has only a single point between (0,0) and (1,1).
- (1 Point) Copy and paste the ROC curve with a 0.5 cutoff for the sick class.
- (5 Points) Change the 0.5 cutoff for the sick class to 0.4, then to 0.3, 0.2, 0.1, and finally 0.001, recording the value of auc each time.
| Value | auc |
| 0.5 | 0.822 |
| 0.4 | |
| 0.3 | |
| 0.2 | |
| 0.1 | |
| 0.001 | |
- (5 Points) Repeat this process but use 0.6, 0.7, 0.8, 0.9, and lastly 0.999 for the cutoff value.
| Value | auc |
| 0.5 | 0.822 |
| 0.6 | |
| 0.7 | |
| 0.8 | |
| 0.9 | |
| 0.999 | |
- (3 Points) Describe in detail the effect these changes have on the value of auc.
As the ifelse test for p.card approaches the value 1, the curve gets pushed down to the point where auc approaches _____. Likewise, as the ifelse test approaches the value 0, the auc approaches _____. In both cases, the TPR and FPR approach identical values so TPR/FPR _____.
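One way to see why the auc values move as the cutoff changes: once p.card holds only 1s and 2s, the ROC plot has a single interior point (FPR, TPR), and the area under its two line segments works out to (1 + TPR - FPR) / 2. The sketch below illustrates this in base R on synthetic scores. It does not use CardiologyMixed or ROCR, and the names `binary_auc`, `p`, `actual`, and the Beta-distributed scores are assumptions made up for the illustration, so its numbers will not match the tables above.

```r
# Illustrative sketch on synthetic data (NOT the CardiologyMixed data set).
set.seed(1)
n <- 200
actual <- rep(c(1, 2), each = n / 2)            # assumed coding: 1 = healthy, 2 = sick
p <- c(rbeta(n / 2, 2, 4), rbeta(n / 2, 4, 2))  # hypothetical predicted P(sick)

# AUC of a hard 1/2 prediction. The ROC plot then has one interior point
# (FPR, TPR), so the area under it is (1 + TPR - FPR) / 2.
binary_auc <- function(p, actual, cutoff) {
  pred <- ifelse(p > cutoff, 2, 1)       # same rule as the ifelse in Script 5.9
  tpr <- mean(pred[actual == 2] == 2)    # share of sick cases predicted sick
  fpr <- mean(pred[actual == 1] == 2)    # share of healthy cases predicted sick
  c(cutoff = cutoff, tpr = tpr, fpr = fpr, auc = (1 + tpr - fpr) / 2)
}

# Sweep the same cutoff values the exercise asks for.
cutoffs <- c(0.001, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.999)
auc.sweep <- t(sapply(cutoffs, function(k) binary_auc(p, actual, k)))
print(round(auc.sweep, 3))
```

Reading down the printed tpr and fpr columns shows how the two rates move together at the extreme cutoffs, which is the behaviour the last part of the question asks about.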
# Script 5.8 Logistic Regression: Cardiology Patient Data
#PREPROCESSING
set.seed(100)
card.data <- CardiologyMixed
#summary(card.data)
index <- sample(1:nrow(card.data), 2/3*nrow(card.data))
card.train <- card.data[index,]
card.test <- card.data[-index,]
# CREATE AND ANALYZE LOGISTIC REGRESSION MODEL
card.glm <- glm(class ~ .,data = card.train,family= binomial(link='logit'))
summary(card.glm)
anova(card.glm, test="Chisq")
card.results <- predict(card.glm, card.test, type='response')
card.table <- cbind(Pred=round(card.results,3),Class=card.test$class)
card.table <- data.frame(card.table)
head(card.table)
# healthy <= 0.5 sick > 0.5
# CREATING A CONFUSION MATRIX
card.results <- ifelse(card.results > 0.5,2,1) # > .5 a sick
card.results
card.pred <- factor(card.results,labels=c("Healthy","Sick"))
card.pred
my.conf <- table(card.test$class,card.pred,dnn=c("Actual","Predicted"))
my.conf
confusionP(my.conf)
# Script 5.9 ROC CURVE AND AREA UNDER THE CURVE
# CREATE THE ROC CURVE
library(ROCR)
p.card <- predict(card.glm, card.test, type="response")
#p.card <- ifelse(p.card > 0.5,2,1)
pr.card <- prediction(p.card, card.test$class)
prf.card <- performance(pr.card, measure = "tpr", x.measure = "fpr")
plot(prf.card)
# DETERMINE THE AREA UNDER THE ROC CURVE
auc <- performance(pr.card, measure = "auc")
auc <- auc@y.values[[1]]
auc
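As a cross-check on the auc value, the area under an ROC curve can also be computed from ranks (the Mann-Whitney identity), with tied scores sharing an average rank. The sketch below is a base R illustration and not part of Script 5.8 or 5.9; the function `rank_auc` and the four-element example vectors are hypothetical.

```r
# Rank-based AUC: the probability that a randomly chosen positive case
# scores higher than a randomly chosen negative case (ties count as 1/2).
rank_auc <- function(score, actual, positive = 2) {
  is.pos <- actual == positive
  n.pos <- sum(is.pos)
  n.neg <- sum(!is.pos)
  r <- rank(score)  # tied scores receive the average of their ranks
  (sum(r[is.pos]) - n.pos * (n.pos + 1) / 2) / (n.pos * n.neg)
}

# Tiny made-up example: 1 = healthy, 2 = sick (assumed coding).
score  <- c(0.2, 0.6, 0.4, 0.9)
actual <- c(1, 1, 2, 2)

auc.raw <- rank_auc(score, actual)                         # decimal scores
auc.bin <- rank_auc(ifelse(score > 0.5, 2, 1), actual)     # after the ifelse
print(c(raw = auc.raw, binarized = auc.bin))
```

With the objects from Script 5.9 in the workspace, `rank_auc(p.card, as.numeric(card.test$class))` should be close to the auc printed by the script, assuming class is a factor whose second level is the sick class.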
