Help with the confusionMatrix for accuracy on the validation/training sets in R!
Here is my dataset code:
# Read in the heart statlog file and recode:
library(dplyr)  # provides %>%, mutate() and recode_factor()
heart <- read.table("https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat", header = FALSE)
names(heart) <- c("age", "sex", "cpt", "rbp", "serum", "fbs", "rer", "mhr", "eia",
"oldpeak", "slope", "nmaj", "thal", "class")
recoded_heart <- heart %>%
mutate(sex = recode_factor(sex, "0" = "female", "1" = "male"),
cpt = recode_factor(cpt, "1" = "typical angina", "2" = "atypical angina",
"3" = "non anginal pain", "4" = "asymptomatic"),
fbs = recode_factor(fbs, "1" = "true", "0" = "false"),
rer = recode_factor(rer, "0" = "normal", "1" = "wave abnormal", "2" = "left hypertrophy"),
eia = recode_factor(eia, "0" = "no", "1" = "yes"),
slope = recode_factor(slope, "1" = "up", "2" = "flat", "3" = "down"),
thal = recode_factor(thal, "3" = "normal", "6" = "fixed defect", "7" = "reversable defect"),
class = recode_factor(class, "1" = "absence", "2" = "presence"))
recoded_heart$class = as.factor(recoded_heart$class)
Here is the code that creates the training and validation (test) sets:
# Creating the training and validation set:
set.seed(1)
library(caTools)  # provides sample.split()
library(caret)    # provides confusionMatrix()
split = sample.split(recoded_heart$class, SplitRatio = 0.8)
heart_train = subset(recoded_heart, split == TRUE)
heart_test = subset(recoded_heart, split == FALSE)
# Checking number of rows and column in each set
print(dim(heart_train))
print(dim(heart_test))
# Building the model
model_glm = glm(class~ . , family = "binomial",
data = heart_train, maxit = 100)
# Predictions on the validation set
predictTest = predict(model_glm, newdata = heart_test,
type = "response")
predicted.class <- transmute(data.frame(predictTest),
class = ifelse(predictTest > 0.5, "presence", "absence"))
# Confusion Matrix tells the accuracy and precision of the validation set:
print(confusionMatrix(predicted.class$class, recoded_heart$class))
# Confusion Matrix tells the accuracy and precision of the training set:
print(confusionMatrix(heart_train$class, recoded_heart$class))
I get an error about the arguments needing to have the same length. Please help!
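A likely cause, judging from the code above: predicted.class$class has one entry per row of heart_test, while recoded_heart$class has one entry per row of the full dataset, so the two arguments cannot have the same length. The second call has the same mismatch and also passes the true training labels instead of predictions. In addition, confusionMatrix() expects two factors with the same levels, and ifelse() returns a character vector. The sketch below shows one way to keep predictions and reference aligned. The helper glm_confusion() and the simulated toy data are hypothetical and used only for illustration; they are not part of the original code.

```r
library(caret)  # provides confusionMatrix()

# Hypothetical helper: score a fitted binomial glm against the same rows that
# supply the reference labels, so both vectors always have equal length.
glm_confusion <- function(model, data, outcome = "class", cutoff = 0.5) {
  actual <- data[[outcome]]
  prob <- predict(model, newdata = data, type = "response")
  # A binomial glm models the probability of the second factor level.
  lv <- levels(actual)
  # Convert to a factor with the same levels as the reference labels.
  pred <- factor(ifelse(prob > cutoff, lv[2], lv[1]), levels = lv)
  confusionMatrix(data = pred, reference = actual, positive = lv[2])
}

# Small simulated stand-in for the heart data (an assumption, not the real file).
set.seed(1)
n <- 200
toy <- data.frame(x = rnorm(n))
toy$class <- factor(ifelse(toy$x + rnorm(n) > 0, "presence", "absence"),
                    levels = c("absence", "presence"))
idx <- sample(n, size = 0.8 * n)
toy_train <- toy[idx, ]
toy_test <- toy[-idx, ]
toy_glm <- glm(class ~ ., family = "binomial", data = toy_train)

# Validation set: predictions on toy_test compared with toy_test labels.
cm_test <- glm_confusion(toy_glm, toy_test)
# Training set: predictions on toy_train compared with toy_train labels.
cm_train <- glm_confusion(toy_glm, toy_train)

print(cm_test$overall["Accuracy"])
print(cm_train$overall["Accuracy"])
```

With the objects from the question, the equivalent calls would be glm_confusion(model_glm, heart_test) for the validation set and glm_confusion(model_glm, heart_train) for the training set, assuming "absence" is the first level of class as in the recoding above.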
