Question: Use Script 5.8 as a template to build and test two logistic regression models. Use RStudio.
- (4 Points) Build and test one model using the following attributes: gender, chest.pain.type, X.colored.vessels, thal. Provide the confusion matrix.
Correct = _____
Incorrect = _____
Accuracy = _____%
- (4 Points) Build and test a second model using the following attributes: age, gender, chest.pain.type, maximum.heart.rate, peak, X.colored.vessels, thal. Provide the confusion matrix.
Correct = _______
Incorrect = _______
Accuracy = _______%
- (1 Point) Compare the test set accuracy of the models to each other.
The __________ (first or second) model gives a better test set result. Whether the difference is statistically significant is an open question. The procedure for determining statistical significance for comparing models is a topic covered in Chapter 9 (Module #12).
- (4 Points) Repeat Script 5.11 below but use attribute nine as the lone input attribute. Next, build a second model by replacing attribute nine with attribute twelve. Build a final model using both attributes for input. Which model shows the best test set accuracy?
Nine only = _______%
Twelve only = _______%
Both nine and twelve = _______%
Attribute twelve _______ (is or is not) useful.
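Each blank above (Correct, Incorrect, Accuracy) can be read off a confusion matrix: correct predictions lie on the diagonal, and everything else is incorrect. The following is a minimal sketch of that arithmetic. The helper name conf.summary and the counts in toy.conf are made up for illustration; they are not from the textbook or from either data set.

```r
# Hypothetical helper (not from the textbook): summarize a square
# confusion matrix whose rows are actual classes and columns are
# predicted classes.
conf.summary <- function(conf) {
  correct   <- sum(diag(conf))        # diagonal cells = correct predictions
  incorrect <- sum(conf) - correct    # off-diagonal cells = errors
  c(Correct   = correct,
    Incorrect = incorrect,
    Accuracy  = round(100 * correct / sum(conf), 2))
}

# Made-up counts, for illustration only
toy.conf <- matrix(c(40,  5,
                      7, 48),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Actual    = c("Healthy", "Sick"),
                                   Predicted = c("Healthy", "Sick")))

conf.summary(toy.conf)
```

The same function accepts the table objects that the scripts build with table(), since a two-way table is also a matrix.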
# Script 5.8 Logistic Regression: Cardiology Patient Data

# PREPROCESSING
set.seed(100)
card.data <- CardiologyMixed
#summary(card.data)
index <- sample(1:nrow(card.data), 2/3*nrow(card.data))
card.train <- card.data[index,]
card.test <- card.data[-index,]

# CREATE AND ANALYZE LOGISTIC REGRESSION MODEL
card.glm <- glm(class ~ ., data = card.train, family = binomial(link='logit'))
summary(card.glm)
anova(card.glm, test="Chisq")

card.results <- predict(card.glm, card.test, type='response')
card.table <- cbind(Pred=round(card.results,3), Class=card.test$class)
card.table <- data.frame(card.table)
head(card.table)
# healthy <= 0.5 sick > 0.5

# CREATING A CONFUSION MATRIX
card.results <- ifelse(card.results > 0.5,2,1) # > .5 a sick
card.results
card.pred <- factor(card.results, labels=c("Healthy","Sick"))
card.pred
my.conf <- table(card.test$class, card.pred, dnn=c("Actual","Predicted"))
my.conf
confusionP(my.conf)
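To adapt Script 5.8 to a reduced attribute set, the only line that has to change is the model formula: class ~ . becomes class ~ followed by the chosen attributes. The sketch below wraps the fit, predict, and tabulate steps in one function so that two formulas can be scored on the same train/test split. The function name fit.and.score and the data frame toy are assumptions introduced here so the example runs without the CardiologyMixed data; toy is synthetic and its results say nothing about the real data set.

```r
# Hypothetical helper: fit a logistic regression on `train` using `form`,
# then score it on `test` with the 0.5 cutoff used in Script 5.8.
fit.and.score <- function(form, train, test) {
  model <- glm(form, data = train, family = binomial(link = "logit"))
  prob  <- predict(model, test, type = "response")
  # glm() models the probability of the SECOND factor level of the response
  lv   <- levels(train$class)
  pred <- factor(ifelse(prob > 0.5, lv[2], lv[1]), levels = lv)
  conf <- table(test$class, pred, dnn = c("Actual", "Predicted"))
  list(conf = conf, accuracy = 100 * sum(diag(conf)) / sum(conf))
}

# Synthetic stand-in data (NOT the cardiology data)
set.seed(100)
n <- 300
toy <- data.frame(
  age    = round(rnorm(n, mean = 54, sd = 9)),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  peak   = round(runif(n, 0, 4), 1)
)
score <- 0.05 * (toy$age - 54) + 0.8 * (toy$gender == "Male") +
         0.6 * (toy$peak - 2)
toy$class <- factor(ifelse(runif(n) < plogis(score), "Sick", "Healthy"),
                    levels = c("Healthy", "Sick"))

# Same 2/3 : 1/3 split as Script 5.8
index     <- sample(1:nrow(toy), 2/3 * nrow(toy))
toy.train <- toy[index, ]
toy.test  <- toy[-index, ]

# Two models that differ only in their formula
m1 <- fit.and.score(class ~ gender + peak,       toy.train, toy.test)
m2 <- fit.and.score(class ~ age + gender + peak, toy.train, toy.test)
m1$conf
m2$conf
c(first = m1$accuracy, second = m2$accuracy)

# With the real data the first model's call would look like this
# (assuming card.train and card.test exist as in Script 5.8):
# fit.and.score(class ~ gender + chest.pain.type + X.colored.vessels + thal,
#               card.train, card.test)
```

Passing explicit levels to factor() keeps the confusion matrix 2 x 2 even if a model happens to predict only one class, a case in which factor(card.results, labels=c("Healthy","Sick")) would stop with an error.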
# Script 5.11 Bayes CreditScreening with Attribute Selection

library(e1071)  # naiveBayes()
library(RWeka)  # GainRatioAttributeEval()

# PREPROCESSING
# Note: credit.train and credit.test are not created in this excerpt;
# they must already exist before the model is built.
set.seed(100)
best <- GainRatioAttributeEval(class ~ ., data=creditScreening)
best <- sort(best, decreasing = TRUE)
round(best,3)

# CREATE THE MODEL
credit.Bayes <- naiveBayes(class ~ nine + ten + eleven + fifteen, laplace = 1, data = credit.train, type = "class")
summary(credit.Bayes)

# CREATE CONFUSION MATRIX
credit.pred <- predict(credit.Bayes, credit.test)
credit.perf <- table(credit.test$class, credit.pred, dnn=c("actual", "Predicted"))
credit.perf
confusionP(credit.perf)
#############################################
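The last part of the question asks for three naive Bayes models that differ only in their input attributes: nine alone, twelve alone, and both. One way to avoid copying the script three times is to build the three formulas with reformulate() and score each on the same test set. This is a sketch under stated assumptions: bayes.accuracy is a hypothetical helper, the toy data frame is synthetic (with class deliberately tied to nine only), and the model step runs only if the e1071 package is installed.

```r
# Attribute sets named in the exercise
attr.sets <- list(nine = "nine", twelve = "twelve",
                  both = c("nine", "twelve"))

# reformulate() builds class ~ nine, class ~ twelve, class ~ nine + twelve
forms <- lapply(attr.sets, reformulate, response = "class")

# Hypothetical helper: test set accuracy (%) of a naive Bayes model
bayes.accuracy <- function(form, train, test) {
  model <- e1071::naiveBayes(form, data = train, laplace = 1)
  pred  <- predict(model, test)
  conf  <- table(test$class, pred, dnn = c("Actual", "Predicted"))
  100 * sum(diag(conf)) / sum(conf)
}

# Synthetic stand-in data (NOT the creditScreening data)
set.seed(100)
n <- 300
toy <- data.frame(
  nine   = factor(sample(c("f", "t"), n, replace = TRUE)),
  twelve = factor(sample(c("f", "t"), n, replace = TRUE))
)
# Synthetic class: depends on `nine` only, so `twelve` is pure noise here
p <- ifelse(toy$nine == "t", 0.8, 0.2)
toy$class <- factor(ifelse(runif(n) < p, "+", "-"), levels = c("-", "+"))

index     <- sample(1:nrow(toy), 2/3 * nrow(toy))
toy.train <- toy[index, ]
toy.test  <- toy[-index, ]

# Fit and score one model per formula (skipped if e1071 is missing)
if (requireNamespace("e1071", quietly = TRUE)) {
  acc <- sapply(forms, bayes.accuracy, train = toy.train, test = toy.test)
  print(round(acc, 2))
}

# With the real data the call would be (assuming credit.train and
# credit.test exist as in Script 5.11):
# sapply(forms, bayes.accuracy, train = credit.train, test = credit.test)
```

If adding twelve to nine does not raise test set accuracy above the nine-only model, that is evidence that twelve contributes little beyond what nine already provides.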
