Question:

  1. Import the file containing Scripts 5.8 and 5.9 (below) into RStudio. Locate the line in Script 5.9 that reads: #p.card <- ifelse(p.card > 0.5,2,1). Remove the comment marker and run both Script 5.8 and Script 5.9. You will notice that including the ifelse removes the curvature from the ROC curve: the decimal probabilities in p.card have all been replaced by 1s and 2s, so only a single cutoff remains and the plot reduces to straight line segments.

  1. (1 Point) Copy and paste the ROC curve with a 0.5 cutoff for the sick class.

  1. (5 Points) Change the 0.5 cutoff for the sick class to 0.4, then to 0.3, 0.2, 0.1, and finally 0.001, recording the value of auc each time.

Value    auc
0.5      0.822
0.4
0.3
0.2
0.1
0.001

  1. (5 Points) Repeat this process but use 0.6, 0.7, 0.8, 0.9, and lastly 0.999 for the cutoff value.

Value    auc
0.5      0.822
0.6
0.7
0.8
0.9
0.999

  1. (3 Points) Describe in detail the effect these changes have on the value of auc.

As the ifelse test for p.card approaches the value 1, the curve gets pushed down to the point where auc approaches _____. Likewise, as the ifelse test approaches the value 0, the auc approaches _____. In both cases, the TPR and FPR approach identical values so TPR/FPR _____.
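One way to see why the recoded curve has no bend: once p.card holds only 1s and 2s there is a single operating point, so the ROC plot runs from (0,0) to (FPR, TPR) to (1,1), and the area under those two segments is (1 + TPR - FPR)/2. The sketch below illustrates this in base R. The scores and labels are made up for illustration (they are not the cardiology data), and roc_point_auc is a hypothetical helper name, not part of the scripts.

```r
# Hypothetical toy data: predicted probability of "sick" and the true class
# (1 = healthy, 2 = sick). These values are invented for illustration only.
probs  <- c(0.05, 0.20, 0.35, 0.45, 0.55, 0.65, 0.80, 0.95)
actual <- c(1,    1,    2,    1,    1,    2,    2,    2)

# TPR, FPR and the area under the two-segment ROC left after recoding
# the probabilities at a given cutoff.
roc_point_auc <- function(probs, actual, cutoff) {
  pred <- ifelse(probs > cutoff, 2, 1)   # same recoding as in Script 5.9
  tpr  <- sum(pred == 2 & actual == 2) / sum(actual == 2)
  fpr  <- sum(pred == 2 & actual == 1) / sum(actual == 1)
  # Trapezoid area under (0,0) -> (fpr, tpr) -> (1,1)
  c(tpr = tpr, fpr = fpr, auc = (1 + tpr - fpr) / 2)
}

mid  <- roc_point_auc(probs, actual, 0.5)
low  <- roc_point_auc(probs, actual, 0.001)
high <- roc_point_auc(probs, actual, 0.999)
print(rbind(mid, low, high))
```

At the extreme cutoffs every case lands in one class, so the single point sits on the diagonal at (1,1) or (0,0).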

# Script 5.8 Logistic Regression: Cardiology Patient Data

#PREPROCESSING

set.seed(100)
card.data <- CardiologyMixed
#summary(card.data)

index <- sample(1:nrow(card.data), 2/3*nrow(card.data))
card.train <- card.data[index,]
card.test <- card.data[-index,]

# CREATE AND ANALYZE LOGISTIC REGRESSION MODEL

card.glm <- glm(class ~ ., data = card.train, family = binomial(link='logit'))
summary(card.glm)
anova(card.glm, test="Chisq")

card.results <- predict(card.glm, card.test, type='response')
card.table <- cbind(Pred=round(card.results,3), Class=card.test$class)
card.table <- data.frame(card.table)
head(card.table)

# healthy <= 0.5 sick > 0.5

# CREATING A CONFUSION MATRIX

card.results <- ifelse(card.results > 0.5,2,1) # > .5 a sick
card.results
card.pred <- factor(card.results, labels=c("Healthy","Sick"))
card.pred
my.conf <- table(card.test$class, card.pred, dnn=c("Actual","Predicted"))
my.conf
confusionP(my.conf)

# Script 5.9 ROC CURVE AND AREA UNDER THE CURVE

# CREATE THE ROC CURVE

library(ROCR)
p.card <- predict(card.glm, card.test, type="response")
#p.card <- ifelse(p.card > 0.5,2,1)
pr.card <- prediction(p.card, card.test$class)
prf.card <- performance(pr.card, measure = "tpr", x.measure = "fpr")
plot(prf.card)

# DETERMINE THE AREA UNDER THE ROC CURVE

auc <- performance(pr.card, measure = "auc")
auc <- auc@y.values[[1]]
auc
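Rather than editing the cutoff by hand for each table row, the recoding can be swept over all the cutoffs in one pass. The following is a minimal base-R sketch under stated assumptions: the data are simulated stand-ins for p.card and card.test$class (not the real cardiology data), and auc_rank is a hypothetical helper that computes the area with the rank-based (Mann-Whitney) formula instead of calling ROCR. Its numbers will therefore not match the 0.822 reported in the tables.

```r
# Rank-based (Mann-Whitney) AUC. Tied scores receive average ranks, so the
# function also works after the scores have been recoded to 1s and 2s.
auc_rank <- function(scores, labels, positive = 2) {
  is_pos <- labels == positive
  n_pos  <- sum(is_pos)
  n_neg  <- sum(!is_pos)
  r      <- rank(scores)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Simulated stand-in data (assumption): 1 = healthy, 2 = sick.
set.seed(100)
labels <- rep(c(1, 2), each = 50)
probs  <- plogis(rnorm(100, mean = ifelse(labels == 2, 1, -1)))

# Recode at each cutoff, exactly as the ifelse line does, and record the AUC.
cutoffs <- c(0.001, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.999)
auc_by_cutoff <- sapply(cutoffs, function(k) {
  auc_rank(ifelse(probs > k, 2, 1), labels)
})
print(data.frame(cutoff = cutoffs, auc = round(auc_by_cutoff, 3)))
```

With the objects from the scripts, the equivalent call would presumably be auc_rank(ifelse(p.card > k, 2, 1), as.integer(card.test$class)), assuming class is a factor whose second level is the sick class.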
