Help with the confusionMatrix for accuracy on the validation/training sets in R!

Here is my dataset code:

# Read in the heart statlog file and recode:
library(dplyr)  # provides %>%, mutate(), recode_factor() and transmute()

heart <- read.table("https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat", header = FALSE)

names(heart) <- c("age", "sex", "cpt", "rbp", "serum", "fbs", "rer", "mhr", "eia",
                  "oldpeak", "slope", "nmaj", "thal", "class")

recoded_heart <- heart %>%
  mutate(sex = recode_factor(sex, "0" = "female", "1" = "male"),
         cpt = recode_factor(cpt, "1" = "typical angina", "2" = "atypical angina",
                             "3" = "non anginal pain", "4" = "asymptomatic"),
         fbs = recode_factor(fbs, "1" = "true", "0" = "false"),
         rer = recode_factor(rer, "0" = "normal", "1" = "wave abnormal", "2" = "left hypertrophy"),
         eia = recode_factor(eia, "0" = "no", "1" = "yes"),
         slope = recode_factor(slope, "1" = "up", "2" = "flat", "3" = "down"),
         thal = recode_factor(thal, "3" = "normal", "6" = "fixed defect", "7" = "reversable defect"),
         class = recode_factor(class, "1" = "absence", "2" = "presence"))

# recode_factor() already returns a factor; this line only makes that explicit
recoded_heart$class = as.factor(recoded_heart$class)

Here is the code that creates the training and validation (test) sets, fits the model, and calls confusionMatrix:

# Creating the training and validation set:
set.seed(1)
library(caTools)  # sample.split()
library(caret)    # confusionMatrix()

split = sample.split(recoded_heart$class, SplitRatio = 0.8)
heart_train = subset(recoded_heart, split == TRUE)
heart_test = subset(recoded_heart, split == FALSE)

# Checking number of rows and columns in each set
print(dim(heart_train))
print(dim(heart_test))

# Building the model
model_glm = glm(class ~ ., family = "binomial",
                data = heart_train, maxit = 100)

# Predictions on the validation set
predictTest = predict(model_glm, newdata = heart_test,
                      type = "response")

predicted.class <- transmute(data.frame(predictTest),
                             class = ifelse(predictTest > 0.5, "presence", "absence"))

# Confusion Matrix tells the accuracy and precision of the validation set:
print(confusionMatrix(predicted.class$class, recoded_heart$class))

# Confusion Matrix tells the accuracy and precision of the training set:
print(confusionMatrix(heart_train$class, recoded_heart$class))

I get an error about the arguments needing to have the same length. Please help!
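A likely cause of the length error: predicted.class has one row per observation in heart_test (roughly 54 rows with an 0.8 split of the 270-row file), while recoded_heart$class holds the labels for the full data set. The second call has the same mismatch and also compares true training labels against true labels, so it never involves the model's predictions. The sketch below shows one way to line things up. It uses a small synthetic data frame (toy) in place of the real file so that it runs without a download. The names toy, to_class, and the simulated values are assumptions made for illustration only, and the 80/20 split uses base R sample() instead of sample.split(). The caret call is guarded so the example also runs where caret is not installed.

```r
# Synthetic stand-in for the heart data (made-up values, NOT the real file)
set.seed(1)
n <- 270
toy <- data.frame(
  age = rnorm(n, mean = 54, sd = 9),
  mhr = rnorm(n, mean = 150, sd = 23)
)
# Tie the outcome loosely to the predictors so the model has signal to fit
p <- plogis(0.08 * (toy$age - 54) - 0.03 * (toy$mhr - 150))
toy$class <- factor(ifelse(runif(n) < p, "presence", "absence"),
                    levels = c("absence", "presence"))

# 80/20 split (base R here; caTools::sample.split() would serve the same role)
train_idx <- sample(seq_len(n), size = round(0.8 * n))
toy_train <- toy[train_idx, ]
toy_test  <- toy[-train_idx, ]

model <- glm(class ~ ., family = "binomial", data = toy_train)

# Hypothetical helper: convert predicted probabilities into a factor that
# has the same levels as the reference labels. glm() models the probability
# of the SECOND factor level, so prob > 0.5 maps to ref_levels[2].
to_class <- function(prob, ref_levels) {
  factor(ifelse(prob > 0.5, ref_levels[2], ref_levels[1]), levels = ref_levels)
}

pred_test  <- to_class(predict(model, newdata = toy_test,  type = "response"),
                       levels(toy$class))
pred_train <- to_class(predict(model, newdata = toy_train, type = "response"),
                       levels(toy$class))

# Each prediction vector is compared with the labels of the SAME subset
cm_test  <- table(Predicted = pred_test,  Actual = toy_test$class)
cm_train <- table(Predicted = pred_train, Actual = toy_train$class)
acc_test  <- sum(diag(cm_test))  / sum(cm_test)
acc_train <- sum(diag(cm_train)) / sum(cm_train)

print(cm_test)
print(acc_test)
print(cm_train)
print(acc_train)

# The same comparison through caret, if it is available
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(pred_test,  toy_test$class,  positive = "presence"))
  print(caret::confusionMatrix(pred_train, toy_train$class, positive = "presence"))
}
```

Applied to the original objects, this would mean passing heart_test$class (not recoded_heart$class) as the reference for the validation matrix, with the predictions converted to a factor first, for example factor(predicted.class$class, levels = levels(heart_test$class)). For the training matrix, predictions from predict(model_glm, newdata = heart_train, type = "response") would be thresholded the same way and compared with heart_train$class.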
